Abstract
Recent advances in computing technology, the availability of low-cost storage devices, and the popularity of the Internet have enabled data acquisition that differs substantially from the traditional approach (Hand, Mannila, and Smyth, 2001; Wegman, 1995). This trend has influenced higher education in many ways, one of which is the growing number of large data sets available as secondary data sources for academic research. It has become common for higher education journals to present studies (e.g., Rosser, 2004; Toutkoushian and Conley, 2005) based on data sets of several thousand records from sources such as the National Center for Education Statistics (NCES) and the American Association of Community Colleges.
REFERENCES
Cheng, J., and Greiner, R. (1999). Comparing Bayesian network classifiers. Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI), Sweden, 101–108.
Cheng, J., Greiner, R., Kelly, J., Bell, D., and Liu, W. (2001). Learning Bayesian networks from data: An information-theory based approach. Artificial Intelligence, 137(1–2): 43–100.
Elder, J.E. and Pregibon, D. (1996). A statistical perspective on knowledge discovery in databases. In U.M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy (eds.), Advances in Knowledge Discovery and Data Mining (pp. 83–113). Menlo Park, California: AAAI Press.
Fayyad, U.M. (1997, August). Data mining and knowledge discovery in databases: Implications for scientific databases. Paper presented at 9th International Conference on Scientific and Statistical Database Management (SSDBM’97), Olympia, WA.
Frawley, W.J., Piatetsky-Shapiro, G., and Matheus, C.J. (1991). Knowledge discovery in databases: An overview. In G. Piatetsky-Shapiro and W.J. Frawley (eds.), Knowledge Discovery in Databases (pp. 1–27). Menlo Park, CA: MIT/AAAI Press.
Friedman, J.H. (1997). Data mining and statistics: What’s the connection? In D.W. Scott (ed.), Computing Science and Statistics: Vol. 29(1). Mining and Modeling Massive Data Sets in Sciences, and Business with a Subtheme in Environmental Statistics (pp. 3–9). Fairfax Station, VA: Interface Foundation of North America, Inc.
Friedman, N., Geiger, D. and Goldszmidt, M. (1997). Bayesian Network classifiers. Machine Learning 29(2–3): 131–163.
Glymour, C., Madigan, D., Pregibon, D. and Smyth, P. (1997). Statistical themes and lessons for data mining. Data Mining and Knowledge Discovery 1: 11–28.
Hand, D.J. (1999). Statistics and data mining: Intersecting disciplines. SIGKDD Explorations 1: 16–19.
Hand, D.J. (1998). Data mining: Statistics and more? The American Statistician 52: 112–118.
Hand, D.J., Mannila, H., and Smyth, P. (2001). Principles of Data Mining. Cambridge, MA: MIT Press.
Heckerman, D. (1997). Bayesian networks for data mining. Data Mining and Knowledge Discovery 1: 79–119.
Hill, C.M., Malone, L.C., and Trocine, L. (2004). Data mining and traditional regression. In H. Bozdogan (ed.), Statistical Data Mining and Knowledge Discovery (pp. 233–250). Boca Raton, Florida: CRC Press LLC.
Michalski, R.S., Bratko, I., and Kubat, M. (1998). Machine Learning and Data Mining: Methods and Applications. Chichester: John Wiley & Sons.
National Center for Education Statistics [NCES] (2002). National Survey of Postsecondary Faculty 1999. NCES Publication No. 2002151. (Restricted-use data file, CD-ROM). Washington, DC: Author.
Niedermayer, D. (1998). An introduction to Bayesian networks and their contemporary applications. Retrieved on July 24, 2005 from http://www.niedermayer.ca/papers/bayesian/
Press, J. (2004). The role of Bayesian and Frequentist multivariate modeling in statistical data mining. In H. Bozdogan (ed.), Statistical Data Mining and Knowledge Discovery (pp. 233–250). Boca Raton, Florida: CRC Press LLC.
Raftery, A.E. (1995). Bayesian model selection in social research. Sociological Methodology 25: 111–164.
Rosser, V.J. (2004). Faculty members’ intentions to leave: A national study on their worklife and satisfaction. Research in Higher Education 45(3): 285–309.
Toutkoushian, R.K. and Conley, V.M. (2005). Progress for women in academe, yet inequities persist: Evidence from NSOPF:99. Research in Higher Education 46(1): 1–28.
Wegman, E. (1995). Huge data sets and the frontiers of computational feasibility. Journal of Computational and Graphical Statistics 4(4): 281–295.
Williamson, J., and Corfield, D. (2001). Introduction: Bayesianism into the 21st century. In D. Corfield and J. Williamson (eds.), Foundations of Bayesianism (pp. 1–18). Boston: Kluwer Academic Publishers.
Yu, Y., and Johnson, B.W. (2002). Bayesian Belief Network and Its Applications. Tech. Rep. UVA-CSCS-BBN-001. Charlottesville, VA: University of Virginia, Center for Safety-Critical Systems.
Zhou, Z. (2003). Three perspectives of data mining. Artificial Intelligence 143(1): 139–146.
Copyright information
© 2006 Springer
About this chapter
Cite this chapter
Xu, Y.J. (2006). Using Data Mining to Analyze Large Data Sets in Higher Education Research: An Example of Prediction With NSOPF:99. In: Smart, J.C. (ed.) Higher Education: Handbook of Theory and Research, vol 21. Springer, Dordrecht. https://doi.org/10.1007/1-4020-4512-3_9
DOI: https://doi.org/10.1007/1-4020-4512-3_9
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-4509-7
Online ISBN: 978-1-4020-4512-7
eBook Packages: Humanities, Social Sciences and Law; Education (R0)