Abstract
A distinction is drawn between assigned and derived indexing, and the problem of evaluating indexes is addressed. It is observed that certain properties of the indexing graph, and of the act of indexing, can be related to desirable retrieval properties such as precision and recall. The capabilities of humans and computers at the task of indexing are outlined and compared. The common view that there are, and can be, no theories of indexing, and that indexing cannot be taught, is discussed. The phenomenon of generalized synonymy (that we can make indefinitely many paraphrases of a text) is introduced, and derived concept indexing is suggested as the appropriate response to it. The phenomenon of generalized homography (that any text under-determines its conceptual content) is introduced. Finally, algorithmic, or automatic, approaches to the indexing subtasks of producing the indexing graph or schedule, annotation, and clustering are discussed.
Copyright information
© 2012 Springer Science+Business Media New York
About this chapter
Cite this chapter
Frické, M. (2012). Indexing/Annotation. In: Logic and the Organization of Information. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-3088-9_7
DOI: https://doi.org/10.1007/978-1-4614-3088-9_7
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-3087-2
Online ISBN: 978-1-4614-3088-9
eBook Packages: Computer Science, Computer Science (R0)