Abstract
This chapter outlines applications to language analysis that open up through the combined use of two simple yet powerful programming languages with particularly short descriptions: sed and awk. We demonstrate how these two UNIX tools can be used to implement small, useful, customized applications ranging from text formatting and text transformation to sophisticated linguistic computing. The user thus becomes independent of sometimes bulky software packages that may be difficult to customize for particular purposes.
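As a rough illustration of the kind of small, customized application the abstract describes, the following sketch builds a word-frequency list by combining sed and awk. It is not taken from the chapter: the function name `word_freq` is hypothetical, tokens are assumed to be runs of ASCII letters, and `sort` is added only to make the output order deterministic.

```shell
#!/bin/sh
# Hypothetical helper (not from the chapter): word-frequency list for stdin.
# Assumption: a "word" is a run of ASCII letters; everything else separates words.
word_freq() {
    # sed: fold upper case to lower case, then turn every non-letter into a space
    sed -e 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/' \
        -e 's/[^a-z]/ /g' |
    # awk: count each whitespace-separated field, then print "count word"
    awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
         END { for (w in count) print count[w], w }' |
    # sort: highest count first, ties broken alphabetically
    sort -k1,1nr -k2,2
}

printf 'The cat saw the dog.\nThe dog ran.\n' | word_freq
# prints:
# 3 the
# 2 dog
# 1 cat
# 1 ran
# 1 saw
```

The division of labour is the point of the sketch: sed handles the character-level clean-up, while awk's associative arrays do the counting.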
Copyright information
© 2007 Springer-Verlag London Limited
About this chapter
Cite this chapter
Schmitt, L.M., Christianson, K., Gupta, R. (2007). Linguistic Computing with UNIX Tools. In: Kao, A., Poteet, S.R. (eds) Natural Language Processing and Text Mining. Springer, London. https://doi.org/10.1007/978-1-84628-754-1_12
DOI: https://doi.org/10.1007/978-1-84628-754-1_12
Publisher Name: Springer, London
Print ISBN: 978-1-84628-175-4
Online ISBN: 978-1-84628-754-1
eBook Packages: Computer Science, Computer Science (R0)