USING DATA MINING TO ANALYZE LARGE DATA SETS IN HIGHER EDUCATION RESEARCH: AN EXAMPLE OF PREDICTION WITH NSOPF: 99

Part of the book series: Higher Education: Handbook of Theory and Research ((HATR,volume 21))

Abstract

Recent advances in computing technology, the availability of low-cost storage devices, and the popularity of the Internet have enabled data acquisition that differs substantially from the traditional approach (Hand, Mannila, and Smyth, 2001; Wegman, 1995). This trend has influenced higher education in many ways, one of which is the growing number of large data sets available as secondary data sources for academic research. It has become common for journals in higher education to present studies (e.g., Rosser, 2004; Toutkoushian and Conley, 2005) based on data sets of more than several thousand records from sources such as the National Center for Education Statistics (NCES) and the American Association of Community Colleges.

REFERENCES

  • Cheng, J., and Greiner, R. (1999). Comparing Bayesian network classifiers. Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI), Sweden, 101–108.

  • Cheng, J., Greiner, R., Kelly, J., Bell, D., and Liu, W. (2001). Learning Bayesian networks from data: An information-theory based approach. Artificial Intelligence, 137(1–2): 43–100.

  • Elder, J.E. and Pregibon, D. (1996). A statistical perspective on knowledge discovery in databases. In U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy (eds.), Advances in Knowledge Discovery and Data Mining (pp. 83–113). Menlo Park, California: AAAI Press.

  • Fayyad, U.M. (1997, August). Data mining and knowledge discovery in databases: Implications for scientific databases. Paper presented at 9th International Conference on Scientific and Statistical Database Management (SSDBM’97), Olympia, WA.

  • Frawley, W.J., Piatetsky-Shapiro, G., and Matheus, C.J. (1991). Knowledge discovery in databases: An overview. In G. Piatetsky-Shapiro and W.J. Frawley (eds.), Knowledge Discovery in Databases (pp. 1–27). Menlo Park, CA: MIT/AAAI Press.

  • Friedman, J.H. (1997). Data mining and statistics: What’s the connection? In D.W. Scott (ed.), Computing Science and Statistics: Vol. 29(1). Mining and Modeling Massive Data Sets in Sciences, and Business with a Subtheme in Environmental Statistics (pp. 3–9). Fairfax Station, VA: Interface Foundation of North America, Inc.

  • Friedman, N., Geiger, D. and Goldszmidt, M. (1997). Bayesian Network classifiers. Machine Learning 29(2–3): 131–163.

  • Glymour, C., Madigan, D., Pregibon, D. and Smyth, P. (1997). Statistical themes and lessons for data mining. Data Mining and Knowledge Discovery 1: 11–28.

  • Hand, D.J. (1999). Statistics and data mining: Intersecting disciplines. SIGKDD Explorations 1: 16–19.

  • Hand, D.J. (1998). Data mining: Statistics and more? The American Statistician 52: 112–118.

  • Hand, D.J., Mannila, H., and Smyth, P. (2001). Principles of Data Mining. Cambridge, MA: MIT Press.

  • Heckerman, D. (1997). Bayesian networks for data mining. Data Mining and Knowledge Discovery 1: 79–119.

  • Hill, C.M., Malone, L.C., and Trocine, L. (2004). Data mining and traditional regression. In H. Bozdogan (ed.), Statistical Data Mining and Knowledge Discovery (pp. 233–250). Boca Raton, Florida: CRC Press LLC.

  • Michalski, R.S., Bratko, I., and Kubat, M. (1998). Machine Learning and Data Mining: Methods and Applications. Chichester: John Wiley & Sons.

  • National Center for Education Statistics [NCES] (2002). National Survey of Postsecondary Faculty 1999. NCES Publication No. 2002151. (Restricted-use data file, CD-ROM). Washington, DC: Author.

  • Niedermayer, D. (1998). An introduction to Bayesian networks and their contemporary applications. Retrieved on July 24, 2005 from http://www.niedermayer.ca/papers/bayesian/

  • Press, J. (2004). The role of Bayesian and Frequentist multivariate modeling in statistical data mining. In H. Bozdogan (ed.), Statistical Data Mining and Knowledge Discovery (pp. 233–250). Boca Raton, Florida: CRC Press LLC.

  • Raftery, A.E. (1995). Bayesian model selection in social research. Sociological Methodology 25: 111–164.

  • Rosser, V.J. (2004). Faculty members’ intentions to leave: A national study on their worklife and satisfaction. Research in Higher Education 45(3): 285–309.

  • Toutkoushian, R.K. and Conley, V.M. (2005). Progress for women in academe, yet inequities persist: Evidence from NSOPF:99. Research in Higher Education 46(1): 1–28.

  • Wegman, E. (1995). Huge data sets and the frontiers of computational feasibility. Journal of Computational and Graphical Statistics 4(4): 281–295.

  • Williamson, J., and Corfield, D. (2001). Introduction: Bayesianism into the 21st century. In D. Corfield and J. Williamson (eds.), Foundations of Bayesianism (pp. 1–18). Boston: Kluwer Academic Publishers.

  • Yu, Y., and Johnson, B.W. (2002). Bayesian Belief Network and Its Applications. Tech. Rep. UVA-CSCS-BBN-001. Charlottesville, VA: University of Virginia, Center for Safety-Critical Systems.

  • Zhou, Z. (2003). Three perspectives of data mining. Artificial Intelligence 143(1): 139–146.

© 2006 Springer

Cite this chapter

Xu, Y.J. (2006). USING DATA MINING TO ANALYZE LARGE DATA SETS IN HIGHER EDUCATION RESEARCH: AN EXAMPLE OF PREDICTION WITH NSOPF: 99. In: Smart, J.C. (ed.) Higher Education: Handbook of Theory and Research, vol 21. Springer, Dordrecht. https://doi.org/10.1007/1-4020-4512-3_9
