Abstract
One goal of the Computing and Information Technology Interactive Digital Educational Library (CITIDEL) is to maximize the number of computing-related resources it makes available to computer science scholars and practitioners. In this paper, we describe a set of experiments designed to advance this goal by adding to CITIDEL a sub-collection of computing-related electronic theses and dissertations (ETDs) automatically extracted from the Networked Digital Library of Theses and Dissertations (NDLTD) OAI Union Catalog. We analyze the metadata quality of the NDLTD OAI Union Catalog and describe three experiments that combine different sources of evidence to improve the accuracy of filtering the catalog for computing-related entries.
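As a rough illustration of what combining sources of evidence over catalog metadata can look like, the following minimal sketch scores each record from several Dublin Core style fields and keeps those above a threshold. It is not the method used in the experiments: the keyword list, the field weights, the threshold, and every name in it (`Record`, `computing_score`, `filter_computing`) are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Hypothetical keyword list; the abstract does not specify one.
COMPUTING_TERMS = {"computer", "computing", "software", "algorithm", "database"}

# Hypothetical weights: each metadata field is treated as one source of evidence.
FIELD_WEIGHTS = {"subject": 0.5, "title": 0.3, "description": 0.2}


@dataclass
class Record:
    """A simplified metadata record with a few Dublin Core style fields."""
    identifier: str
    title: str = ""
    subject: str = ""
    description: str = ""


def field_evidence(text: str) -> float:
    """Return 1.0 if the field mentions any computing term, else 0.0."""
    tokens = {tok.strip(".,;:()").lower() for tok in text.split()}
    return 1.0 if tokens & COMPUTING_TERMS else 0.0


def computing_score(record: Record) -> float:
    """Combine the per-field evidence into one weighted score in [0, 1]."""
    return sum(
        weight * field_evidence(getattr(record, name))
        for name, weight in FIELD_WEIGHTS.items()
    )


def filter_computing(records, threshold: float = 0.5):
    """Keep only the records whose combined score reaches the threshold."""
    return [r for r in records if computing_score(r) >= threshold]


if __name__ == "__main__":
    sample = [
        Record("etd-1", title="A Scalable Algorithm for Graph Search",
               subject="Computer Science"),
        Record("etd-2", title="Soil Erosion in River Basins",
               subject="Geology",
               description="Field data were stored in a database."),
    ]
    for kept in filter_computing(sample):
        print(kept.identifier, round(computing_score(kept), 2))
```

Weighting fields separately means a weak signal in one field (a passing mention in a description) does not by itself admit a record, while agreement between fields does. A trained classifier could replace the keyword test without changing the overall shape of the filter.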
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, B., Gonçalves, M.A., Fox, E.A. (2003). An OAI-Based Filtering Service for CITIDEL from NDLTD. In: Sembok, T.M.T., Zaman, H.B., Chen, H., Urs, S.R., Myaeng, SH. (eds) Digital Libraries: Technology and Management of Indigenous Knowledge for Global Access. ICADL 2003. Lecture Notes in Computer Science, vol 2911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24594-0_61
DOI: https://doi.org/10.1007/978-3-540-24594-0_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20608-8
Online ISBN: 978-3-540-24594-0
eBook Packages: Springer Book Archive