Abstract
Many kinds of texts are currently available in machine-readable form and are amenable to automatic processing. Because the available databases are large and cover many different subject areas, automatic aids must be provided to users interested in accessing the data. It has been suggested that links be placed between related pieces of text, connecting, for example, particular text paragraphs to other paragraphs covering related subject matter. Such a linked text structure, often called hypertext, makes it possible for the reader to start with particular text passages and use the linked structure to find related text elements [4, 19, 5, 12]. Unfortunately, until now, viable methods for automatically building large hypertext structures and for using such structures in a sophisticated way have not been available. Here we give methods for constructing text relation maps and for using text relations to access and use text databases. In particular, we outline procedures for determining text themes, traversing texts selectively, and extracting summary statements that reflect text content.
Vast amounts of text material are now available in machine-readable form for automatic processing. Here, approaches are outlined for manipulating and accessing texts in arbitrary subject areas in accordance with user needs. In particular, methods are given for determining text themes, traversing texts selectively, and extracting summary statements reflecting text content.
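The idea of linking related paragraphs into a text relation map can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes raw word counts as term vectors, cosine similarity as the relatedness measure, and an arbitrary link threshold of 0.3. The function names `term_vector`, `cosine`, and `relation_map` are hypothetical and chosen only for this illustration.

```python
import math
import re
from collections import Counter
from itertools import combinations


def term_vector(text):
    """Lowercased word counts; a stand-in for a real term-weighting scheme."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relation_map(paragraphs, threshold=0.3):
    """Return {(i, j): similarity} for paragraph pairs at or above the threshold.

    The threshold value is an assumption made for this sketch.
    """
    vectors = [term_vector(p) for p in paragraphs]
    links = {}
    for i, j in combinations(range(len(vectors)), 2):
        sim = cosine(vectors[i], vectors[j])
        if sim >= threshold:
            links[(i, j)] = sim
    return links


# Illustrative paragraphs, not taken from any real collection.
paragraphs = [
    "hypertext links connect related text paragraphs",
    "links between related paragraphs form a hypertext",
    "clustering algorithms group similar documents",
]
links = relation_map(paragraphs)
```

Here only the first two paragraphs share enough vocabulary to be linked. A reader could then traverse the text by following links from a starting paragraph, which is the kind of selective access the abstract describes.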
References
S. Al-Hawamdeh and P. Willett, (1989). Paragraph-Based Near-Neighbor Searching in Full Text Documents. Electronic Publishing 2, 179-.
M.H. Anderson, J. Nielsen and H. Rasmussen, (1989). A Similarity-Based Hypertext Browser for Reading the UNIX Network News, Hypermedia, 1, 255–265.
M. Bernstein, (1990). An Apprentice that Discovers Hypertext Links. In A. Rizk, N. Streitz, and J. Andre, (Eds.) Proc. European Conference on Hypertext (ECHT), 212–223.
M. Bernstein, J.D. Bolter, M. Joyce, and E. Mylonas, (1991). Architectures for Volatile Hypertext. In Proc. Hypertext-91, Association for Computing Machinery, New York, 246–260.
J.D. Bolter, (1991). Writing Space — The Computer, Hypertext, and the History of Writing. L. Erlbaum Associates, Hillsdale, N.J.
R.A. Botafogo, E. Rivlin and B. Shneiderman, (1992). Structural Analysis of Hypertexts: Identifying Hierarchies and Useful Metrics, ACM Transactions on Information Systems, 10, 142–180.
C. Buckley, G. Salton, and J. Allan, (1993). Automatic Retrieval with Locality Information Using SMART, In D.K. Harman (Ed.), The First Text REtrieval Conference (TREC-1). National Institute of Standards and Technology Special Publication 500–207, Gaithersburg, Md. 20899, 59–72.
C. Buckley, J. Allan, and G. Salton, (1994). Automatic Routing and Ad-hoc Retrieval Using SMART: TREC-2, In D.K. Harman (Ed.), The Second Text REtrieval Conference (TREC-2), National Institute of Standards and Technology Special Publication 500–215, Gaithersburg, Md. 20899, 45–55.
M.H. Chignell, B. Nordhausen, J.F. Valdez, and J.A. Waterworth, (1991). The HEFTI Model of Text to Hypertext Conversion. Hypermedia 3, 187-.
W.B. Croft, (1977). Clustering Large Files of Documents Using the Single Link Method. Journal of the American Society for Information Science 28, 341-.
G. de Jong, (1982) in Strategies for Natural Language Processing, W.G. Lehnert and M.H. Ringle, (Eds.), L. Erlbaum Associates, Hillsdale, N.J., 149–176.
P. Delaney and G.P. Landow, Eds., (1991). Hypermedia and Literary Studies. MIT Press, Cambridge, MA (USA).
H.P. Edmundson and R.E. Wyllys, (1961). Automatic Abstracting and Indexing, Survey and Recommendations. Communications of the ACM 4, 226-.
R. Furuta, C. Pleasant and B. Shneiderman, (1989). A Spectrum of Automatic Hypertext Constructions. Hypermedia 1, 179-.
R.S. Gilyarevskii and M.M. Subbotin, (1993). Journal of the American Society for Information Science 44, 185-.
P. Gloor, (1991). Cybermap: Yet Another Way of Navigating Hyperspace. In Proc. Hypertext-91, 107–121.
C. Guinan and A.F. Smeaton, (1992). Information Retrieval from Hypertext Using Dynamically Planned Guided Tours. In Proc. ECHT-92, European Conference on Hypertext, 122–130.
M.A. Hearst and C. Plaunt, (1993). Subtopic Structuring for Full-Length Document Access. In R. Korfhage, E. Rasmussen and P. Willett (Eds.), Proc. 16th ACM-SIGIR Conference, Pittsburgh (USA), 55–68.
G.P. Landow, (1989). Hypertext in Literary Education. Computers and the Humanities, 23, 173-.
H.P. Luhn, (1958). The Automatic Creation of Literature Abstracts. IBM Journal of Research and Development 2, 159-.
F. Murtagh, (1982). A Survey of Recent Advances in Hierarchical Clustering Algorithms. The Computer Journal 26, 354-.
J. O’Connor, (1975). Retrieval of Answer Sentences and Answer Figures by Text Searching. Information Processing and Management, 11, 155-.
J. O’Connor, (1980). Answer Passage Retrieval by Text Searching. Journal of the American Society for Information Science 32, 227-.
C.D. Paice, (1990). Constructing Literature Abstracts by Computer. Information Processing and Management, 26, 171-.
T.C. Rearick, (1991). In Hypertext/Hypermedia Handbook, J. Devlin and E. Berk, (Eds.) McGraw Hill, New York, 113–140.
J.E. Rush, R. Salvador, and A. Zamora, (1964). Automatic Abstracting and Indexing — Production of Indicative Abstracts by Application of Contextual Inference and Syntactic Coherence Criteria. Journal of the American Society for Information Science 22, 260-.
G. Salton, Ed., (1971). The Smart Retrieval System — Experiments in Automatic Document Processing. Prentice Hall, Englewood Cliffs, N.J.
G. Salton, C.S. Yang, and A. Wong, (1975). A Vector Space Model for Automatic Indexing. Communications of the ACM 18, 613-.
G. Salton and A. Wong, (1978). Generation and Search of Clustered Files. ACM Transactions on Database Systems 3, 321-.
G. Salton, (1981). Automatic Text Processing — The Transformation, Analysis, and Retrieval of Information by Computer. Addison Wesley, Reading, MA.
G. Salton, (1991). Developments in Automatic Text Retrieval. Science 253, 974-.
G. Salton and C. Buckley, (1991). Global Text Matching for Information Retrieval. Science 253, 1012-.
G. Salton and C. Buckley, (1991). Automatic Text Structuring and Retrieval: Experiments in Automatic Encyclopedia Searching. In A. Bookstein, Y. Chiaramella, G. Salton and V.V. Raghavan (Eds.), Proc. 14th ACM-SIGIR Conference, Chicago (USA), 21–30.
G. Salton, C. Buckley and J. Allan, (1992). Automatic Structuring of Text Files. Electronic Publishing 5, 1-.
G. Salton, J. Allan, and C. Buckley, (1993). Approaches to Passage Retrieval in Full Text Information Systems. In R. Korfhage, E. Rasmussen and P. Willett (Eds.), Proc. 16th ACM-SIGIR Conference, Pittsburgh (USA), 49–58.
Copyright information
© 1996 Kluwer Academic Publishers
About this chapter
Cite this chapter
Salton, G., Allan, J., Buckley, C., Singhal, A. (1996). Automatic Analysis, Theme Generation, and Summarization of Machine-Readable Texts. In: Agosti, M., Smeaton, A.F. (eds) Information Retrieval and Hypertext. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-1373-1_3
DOI: https://doi.org/10.1007/978-1-4613-1373-1_3
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4612-8593-9
Online ISBN: 978-1-4613-1373-1
eBook Packages: Springer Book Archive