Command-Line Tools in Linux for Handling Large Data Files

  • Chapter
Bioinformatics: Sequences, Structures, Phylogeny

Abstract

Linux is a freely available Unix-like operating system that is mostly used through a command-line interface. It is frequently used for processing large data files of various types, and most of the software used in genomics, proteomics, and bioinformatics is developed to run on it. This chapter explains the hierarchical structure of the Linux operating system, along with its file types and the commands used for file and process handling. It also describes Vi/Vim, one of the most common text editors used in Linux. The Vi editor has multiple modes of operation, most of which are explained in detail. Vi is used for editing files or writing code in a programming language. In addition, multiple command-line tools are available for editing files directly from the terminal. One of these tools, Awk, which is itself an interpreted programming language generally used for data and text processing, is discussed with examples of manipulating common data files as well as sequence data files.

Acknowledgement

We would like to thank our teacher, Dr. Asheesh Shanker, for giving us the opportunity to contribute towards this book and our friend Dr. Vasantika Singh for her helpful suggestions.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Mishra, D., Khandelwal, G. (2018). Command-Line Tools in Linux for Handling Large Data Files. In: Shanker, A. (eds) Bioinformatics: Sequences, Structures, Phylogeny. Springer, Singapore. https://doi.org/10.1007/978-981-13-1562-6_17
