Word-concept clusters in a legal document collection
For very large document collections or high-volume streams of documents, such as information resources on the web, finding relevant documents is a major information filtering problem. Traditional full-text retrieval methods cannot locate documents that use specialised synonyms of, or concepts related to, the terms in the formal query. This is a particular problem in legal document collections, since lawyers use ordinary words with specialised meanings that vary subtly between legal sub-domains. We use a neural network approach to learn, from a sample document set, synonyms and clusters of related words that define similar concepts. We demonstrate that our clusters of words are qualitatively useful, in the legal domain in particular, and can thus be used for high-throughput information filtering to find documents likely to contain concepts relevant to a user's information need.