3  Working with files

Note⏳ Time
  • Teaching: 15 min
  • Exercises: 5 min
Note🤔 Questions
  • How do I create, copy, move and delete files?
  • How do I look at the contents of a file without opening it in a text editor?
Note🎯 Objectives
  • Create empty files with touch
  • Copy and move files with cp and mv
  • Delete files safely with rm
  • Inspect file contents with cat, head, tail, and less

3.1 Touching (Creating an empty file)

The command touch creates an empty file if one with that name does not already exist. If the file does exist, touch only updates its timestamp and leaves the contents unchanged:

cd ~/training
touch notes.txt
ls
Note

Empty files might seem useless, but touch is handy for quickly creating placeholder files, or for testing commands before applying them to real data.
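
As a minimal sketch of both behaviours, the commands below work in a throwaway directory created with mktemp -d (an assumption, so that nothing in ~/training is affected) and use a hypothetical file name, placeholder.txt:

```shell
# Work in a throwaway directory (assumption: keeps ~/training untouched)
scratch=$(mktemp -d)
cd "$scratch"

# 1. The file does not exist yet, so touch creates it with a size of 0 bytes
touch placeholder.txt
ls -l placeholder.txt

# 2. Put some text in the file, then touch it again
echo "some text" > placeholder.txt
touch placeholder.txt

# The contents are unchanged; only the modification time was updated
cat placeholder.txt
```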


3.2 Copying (Making a duplicate)

The command cp (copy) duplicates a file. The syntax is:

cp source destination

For example:

cp notes.txt notes_backup.txt
ls

To copy a whole directory and everything inside it, add the -r flag (recursive):

cp -r data data_backup
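
One way to confirm that a recursive copy is complete is to compare the two directories with diff -r. The sketch below builds a small hypothetical data directory in a throwaway location (both are assumptions for illustration) and then copies and compares it:

```shell
# Build a small example directory in a throwaway location (assumption)
scratch=$(mktemp -d)
cd "$scratch"
mkdir -p data/raw
echo "sample A" > data/readme.txt
echo "ACGT" > data/raw/seq.txt

# Copy the directory and everything inside it
cp -r data data_backup

# diff -r prints nothing and succeeds when the two trees are identical
diff -r data data_backup && echo "copy matches original"
```

Note that if data_backup already exists as a directory, cp -r places the copy inside it (as data_backup/data) instead of replacing it.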

3.3 Moving (Relocating or renaming)

The command mv (move) moves a file to a new location. It is also the usual way to rename files in Unix, because renaming a file is the same as moving it to a new name:

# rename a file
mv notes.txt readme.txt

# move it into a subdirectory
mv readme.txt data/readme.txt
Warning

mv will silently overwrite the destination if a file with that name already exists there. Always double-check before moving.
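
When typing commands by hand, mv -i asks for confirmation before overwriting an existing file. In a script, one possible guard is to test for the destination first. The sketch below uses hypothetical file names in a throwaway directory (assumptions for illustration only):

```shell
# Throwaway directory with two hypothetical files (assumption)
scratch=$(mktemp -d)
cd "$scratch"
echo "old results" > results.txt
echo "new notes" > notes.txt

# Check whether the destination exists before moving anything
if [ -e results.txt ]; then
    echo "results.txt already exists; not moving notes.txt"
else
    mv notes.txt results.txt
fi

# results.txt was protected, so it still holds its original contents
cat results.txt
```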


3.4 Removing (Deleting permanently)

The command rm (remove) deletes files. Unlike moving something to the Trash, there is no undo:

rm notes_backup.txt

To remove a directory and all its contents:

rm -r data_backup
Warning

Be very careful with rm -r. A common safety habit is to use rm -ri instead, which asks for confirmation before deleting each item:

rm -ri data_backup
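
Another possible habit is to list exactly what a directory contains before removing it. The sketch below does this on a hypothetical data_backup directory created in a throwaway location (an assumption, so that no real data is at risk):

```shell
# Create a hypothetical directory tree in a throwaway location (assumption)
scratch=$(mktemp -d)
cd "$scratch"
mkdir -p data_backup/raw
touch data_backup/readme.txt data_backup/raw/seq.txt

# Look first: ls -R lists the directory and all of its subdirectories
ls -R data_backup

# Only then delete it; after this there is no way to get it back
rm -r data_backup

# Confirm that the directory is gone
[ -e data_backup ] || echo "data_backup has been removed"
```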

3.5 Concatenating (Printing a whole file)

The command cat (short for concatenate) prints the entire contents of a file to the terminal:

cat data/readme.txt

cat is fast and simple, but it prints everything at once, which is not ideal for large files.
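
The name makes more sense when cat is given several files: it prints them one after another, joined end to end. A minimal sketch, using two hypothetical files in a throwaway directory (assumptions for illustration):

```shell
# Two small hypothetical files in a throwaway directory (assumption)
scratch=$(mktemp -d)
cd "$scratch"
echo "first file" > part1.txt
echo "second file" > part2.txt

# Print both files, in the order given
cat part1.txt part2.txt

# Redirecting the output joins them into a new file
cat part1.txt part2.txt > combined.txt
```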


3.6 Head and tail (Peeking at the beginning or end)

head shows the first 10 lines of a file by default:

head data/readme.txt

Use -n to choose a different number of lines:

head -n 5 data/readme.txt

tail shows the last 10 lines by default, and accepts -n in the same way:

tail data/readme.txt

These are especially useful for large biological data files, because you can check the format without loading the whole thing.
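
To see the effect on a file longer than 10 lines, the sketch below generates a hypothetical 100-line file, numbers.txt, with seq (the file name and throwaway directory are assumptions for illustration):

```shell
# Create a 100-line file in a throwaway directory (assumption)
scratch=$(mktemp -d)
cd "$scratch"
seq 1 100 > numbers.txt

# First three lines: 1, 2, 3
head -n 3 numbers.txt

# Last two lines: 99, 100
tail -n 2 numbers.txt
```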


3.7 Less (Scrolling through a large file)

For files too large to comfortably cat, less opens them in an interactive scrollable viewer:

less data/readme.txt
Key         Action
Space       Page down
b           Page up
↑ / ↓       One line at a time
/pattern    Search for pattern
n           Next search result
q           Quit
Tip

You will use less constantly when working with FASTQ, FASTA, and other large genomics files. It does not need to read the whole file before showing the first page, so it opens quickly even on files that are gigabytes in size.


Caution✏️ Exercise 3.1
  1. Navigate to ~/training
  2. Create a file called experiment.txt inside the data/ directory
  3. Make a copy of it called experiment_v2.txt in the same directory
  4. Rename experiment.txt to experiment_v1.txt
  5. List the contents of data/ to confirm
cd ~/training
touch data/experiment.txt
cp data/experiment.txt data/experiment_v2.txt
mv data/experiment.txt data/experiment_v1.txt
ls data/
Important🚀 Bonus: for those who want more

The command wc (word count) counts lines, words and bytes in a file:

wc data/readme.txt

The output format is: lines words bytes filename. For plain ASCII text, the number of bytes is the same as the number of characters.

Use wc -l to count only lines. We will use this in Chapter 5 to count sequences in a FASTA file.
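
A small worked example, using a hypothetical two-line file in a throwaway directory (assumptions for illustration), makes the numbers easier to read. The third number is the byte count, which matches the character count for plain ASCII text like this:

```shell
# A two-line, three-word file in a throwaway directory (assumption)
scratch=$(mktemp -d)
cd "$scratch"
printf 'one two\nthree\n' > sample.txt

# Prints 2 3 14 followed by the file name (spacing varies between systems)
wc sample.txt

# Count only lines; reading from < leaves the file name out of the output
wc -l < sample.txt
```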


Tip🔑 Key points
  • touch creates an empty file; cp copies; mv moves or renames; rm deletes permanently (there is no undo)
  • cat prints a whole file; head and tail show the beginning and end; less lets you scroll through large files
  • Always double-check before using rm -r