Since Bash commands are unavoidable when you work on NLP and MT tasks, I thought it would be useful to list the commands I have learnt to use on a daily basis, thanks to practice, searching, and helpful colleagues I have met over the years. Obviously, this is not an exhaustive list; however, I hope it includes most of the one-line Bash commands you would need. Please note that most of these commands have been tested mainly on Linux.
File Management
Fundamentals
Open a directory:cd <path/dir_name>
List the files and sub-directories in the current directory:ls
Create a new directory:mkdir <dir_name>
Rename or move a file or directory:mv <old_filename> <new_filename>
Move a file to a directory: mv <old_filename> <dir_name>
Move all files whose names start with a string, using *: mv <old_filename>* <folder_name>
Rename multiple files (this is the Perl-style syntax of rename, as found on Debian/Ubuntu): rename 's/<original_string>/<new_string>/g' *
Delete a file:rm <file_name>
To delete multiple files, list them after rm, separated by spaces: rm <file_name1> <file_name2> <file_name3>
Delete any file whose name starts with “wow”, using *: rm wow*
Delete a directory and its contents:rm -r <dir_name>
Avoid deleting files by mistake by using trash instead of rm; first install trash-cli: sudo apt-get install trash-cli
• Delete:trash <file_name>
• List trashed items:trash-list
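• Restore a file (first cd to the root folder or to the folder the file was deleted from): restore-trash, and then type the number of the item. In newer versions of trash-cli, the command is named trash-restore.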
• Empty the trash list:trash-empty
Copy a file:cp <original_filename> <new_filename>
Copy a directory and its contents (at least -r is required): cp -avr <original_dirname> <new_dirname>
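To try the basic commands above without risking real files, here is a minimal sketch that runs them in a throwaway directory. The names data, notes.txt, draft.txt, and data_backup are made up for illustration.

```shell
#!/usr/bin/env bash
# Work inside a throwaway directory so no real files are touched.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

mkdir data                      # create a directory
printf 'hello\n' > notes.txt    # create a small text file
mv notes.txt draft.txt          # rename the file
mv draft.txt data/              # move it into the directory
cp -r data data_backup          # copy the directory and its contents
rm -r data                      # delete the original directory and its contents

find . -type f                  # prints ./data_backup/draft.txt
```

Only the backup copy survives, which is a quick way to convince yourself that cp -r really copied the contents before rm -r removed the original.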
Create a new text file:nano <file_name>
Complete a command or file name (e.g. my_file_name.txt):
Type my and then press Tab once if no other file starts with “my”.
OR
Type my and then press Tab twice to list the files that start with “my”.
Move to a location in a command or text:
Move the cursor to the location, press Alt or Option, and click.
Clear the current window:
Type clear
OR
Press Ctrl+l
End the current command (before it finishes):
Press Ctrl+c
Advanced
Move to the last accessed path:cd -
List the *.txt files in the current directory (or path):ls *.txt
Show the files, and the contents of the directories, whose names start with “aaa”: ls aaa*
Show files and subdirectories in all directories in the current directory:ls *
List all the files with details:ls -l
Display file details:ls -l <file_name>
List all the files with details, with human-readable sizes (KB/MB/GB): ls -lh
List all the files with details and human-readable sizes, sorted by time, newest first: ls -lht
The same for a nested path, using * as a wildcard: ls -lht <dir_name1>/*/<dir_name2>
List all the files with details and human-readable sizes, sorted by time, oldest first: ls -lhtr
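The difference between -t and -tr is easy to mix up, so here is a small sketch that checks it on three files with known timestamps. It assumes GNU touch (Linux) for the -d option, and the file names are made up.

```shell
#!/usr/bin/env bash
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

# Create three empty files with explicit modification times (GNU touch).
touch -d '2020-01-01 00:00:00' old.txt
touch -d '2021-01-01 00:00:00' mid.txt
touch -d '2022-01-01 00:00:00' new.txt

newest="$(ls -t | head -n 1)"    # -t sorts newest first
oldest="$(ls -tr | head -n 1)"   # -r reverses the order: oldest first

echo "first with -t : $newest"   # new.txt
echo "first with -tr: $oldest"   # old.txt
```

In practice, -tr is handy in long directories because the most recent files end up at the bottom, next to the prompt.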
List file sizes only for all files in the current directory:ls -hs
OR du
Display the file size only: ls -hs <file_name>
OR du <file_name>
Display sizes of the current directory:du -d 1 -h .
Sort the results in ascending order:du -d 1 -h . | sort -h
Sort the results in descending order: du -d 1 -h . | sort -h -r
Find files that are bigger than 200 MB: find /home/$USER/ -type f -size +200000k -exec ls -lh {} \; | awk '{ print $5 " --> " $9 }'
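To see how the -size test behaves before running it on a whole home directory, here is a scaled-down sketch with a 1 MB threshold instead of 200 MB. The file names are made up, and it assumes find counts k as units of 1024 bytes, as GNU find does.

```shell
#!/usr/bin/env bash
workdir="$(mktemp -d)"

head -c 2097152 /dev/zero > "$workdir/big.bin"   # a 2 MiB file of zero bytes
printf 'tiny\n' > "$workdir/small.txt"           # a file of a few bytes

# +1024k means "more than 1024 KiB"; only big.bin qualifies.
big_files="$(find "$workdir" -type f -size +1024k)"
echo "$big_files"
```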
Display file size with stat (Linux):stat --printf="%s" <file_name>
Display file last edited time (Linux):stat -c %y <file_name>
Display file last edited time (Mac):stat -x <file_name>
Get the current path (print working directory):pwd
Create a symbolic link, i.e. a shortcut to a file or directory:ln -s <file_name> <shortcut_name>
Get the full path of a file: readlink -f <file_name>
OR echo "$(pwd)/<file_name>"
OR realpath <file_name>
Get word count in a file:wc <file_name>
Get the number of lines in a file:wc -l <file_name>
Count the lines of all matching files in subdirectories; use * if the file name is partial: find ./ -type f -name "*<file_name>*" -exec wc -l {} +
Count lines in a *.gz file; use -c to avoid writing the uncompressed file to disk: gunzip -c <file_name.gz> | wc -l
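As a quick check of the gunzip -c approach, this sketch compresses a 500-line file and counts its lines without ever recreating the uncompressed file. The file name sample.txt is made up.

```shell
#!/usr/bin/env bash
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

seq 1 500 > sample.txt        # a 500-line sample file
gzip sample.txt               # creates sample.txt.gz and removes sample.txt

# -c writes the decompressed data to stdout, so nothing is written to disk.
line_count="$(gunzip -c sample.txt.gz | wc -l)"
echo "$line_count"            # 500
ls                            # still only sample.txt.gz
```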
Split a file into multiple files, 3000 lines each: split -l 3000 <file_name> <prefix>
Find out if two files are identical:cmp --silent first_file_name second_file_name || echo "——> Files are different."
Find out the difference between two files:diff <file_name1> <file_name2>
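Here is a small sketch that combines split, wc -l, and cmp: it splits a file into chunks, puts the chunks back together, and confirms the result is identical to the original. The file names and the part_ prefix are made up.

```shell
#!/usr/bin/env bash
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

seq 1 7000 > corpus.txt               # a 7000-line sample file
split -l 3000 corpus.txt part_        # creates part_aa, part_ab, part_ac
wc -l part_*                          # 3000, 3000 and 1000 lines

cat part_* > rebuilt.txt              # the glob expands in the same order as split wrote

# cmp exits with status 0 only when the two files are byte-for-byte identical.
if cmp --silent corpus.txt rebuilt.txt; then
  verdict="identical"
else
  verdict="different"
fi
echo "$verdict"                       # identical
```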
Continue a long command on a new line, by ending the current line with: \
Reading files
Read the whole file:cat <file_name>
Read the whole file; display line numbers:cat -n <file_name>
Merge two files; use > to create the output file: cat <file_name1> <file_name2> > <output_file>
Merge all the files that end with (say) “.en” into one file (e.g. “all.en”): cat *.en > all.en
Read the first 10 lines of a file:head <file_name>
Read the first 4 lines of a file:head -4 <file_name>
OR head -n 4 <file_name>
Read the first 3 lines of two files: head -q -n 3 <file_name1> <file_name2>
Read the last 10 lines of a file:tail <file_name>
Read the last 3 lines of a file:tail -3 <file_name>
OR tail -n 3 <file_name>
Read a specific line of a file, e.g. line #10:sed -n 10p <file_name>
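To see head, tail, and sed -n side by side, here is a short sketch on a file whose line N contains the number N, so the output is easy to verify. The file name numbers.txt is made up, and the 10,12p range is an extra variant of the single-line form.

```shell
#!/usr/bin/env bash
workdir="$(mktemp -d)"
seq 1 100 > "$workdir/numbers.txt"    # line N contains the number N

# paste -s joins the selected lines with spaces, just for compact display.
first_four="$(head -n 4 "$workdir/numbers.txt" | paste -s -d ' ' -)"
last_three="$(tail -n 3 "$workdir/numbers.txt" | paste -s -d ' ' -)"
line_ten="$(sed -n 10p "$workdir/numbers.txt")"
lines_10_to_12="$(sed -n '10,12p' "$workdir/numbers.txt" | paste -s -d ' ' -)"

echo "head -n 4     : $first_four"       # 1 2 3 4
echo "tail -n 3     : $last_three"       # 98 99 100
echo "sed -n 10p    : $line_ten"         # 10
echo "sed -n 10,12p : $lines_10_to_12"   # 10 11 12
```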
Read the end of the file, and use -f to keep updating the output as the file grows: tail -f <file_name>
Use Ctrl+C to exit.
Read a file in chunks:less <file_name>
Press Space to move to the next chunk of the file (Enter moves one line at a time), and “q” to quit.
Read a file in chunks, display line numbers:less -N <file_name>
Merge two text files:cat <file_name1> <file_name2> | tee <output_file_name>
Note that each file must end with a newline; otherwise, the last line of one file is joined to the first line of the next.
Merge all the files in the current folder:cat * | tee <output_file_name>
Disable sending to stdout (i.e. printing in Terminal) by adding 1> /dev/null: cat <file_name1> <file_name2> | tee <output_file_name> 1> /dev/null
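The trailing-newline issue when merging files is easier to understand with a concrete case. In this sketch, a.txt deliberately lacks a final newline, so merging glues two lines together; the workaround with echo is just one possible fix and only suits files known to lack the final newline. All file names are made up.

```shell
#!/usr/bin/env bash
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

printf 'first line\nsecond line' > a.txt   # note: no newline at the end
printf 'third line\n' > b.txt

# Merge silently: tee writes the file, and stdout is discarded.
cat a.txt b.txt | tee merged.txt > /dev/null
sed -n 2p merged.txt                       # second linethird line
wc -l < merged.txt                         # 2 lines instead of 3

# One possible workaround: insert the missing newline between the files.
{ cat a.txt; echo; cat b.txt; } > fixed.txt
wc -l < fixed.txt                          # 3
```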
Remove duplicates from a file (the output is sorted): sort -u input.txt > output.txt
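One thing to keep in mind is that sort -u changes the order of the lines. If the original order matters (it often does for parallel corpora), a common alternative, not part of the list above, is the awk idiom shown in this sketch. The sample data is made up.

```shell
#!/usr/bin/env bash
workdir="$(mktemp -d)"
printf 'pear\napple\npear\nbanana\napple\n' > "$workdir/input.txt"

# sort -u removes duplicates but returns the lines in sorted order.
sorted_unique="$(sort -u "$workdir/input.txt" | paste -s -d ' ' -)"

# awk prints a line only the first time it is seen, so the order is kept.
order_kept="$(awk '!seen[$0]++' "$workdir/input.txt" | paste -s -d ' ' -)"

echo "sort -u: $sorted_unique"   # apple banana pear
echo "awk    : $order_kept"      # pear apple banana
```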
Finding
Find files that include a phrase (e.g. “really great” in *.txt files): grep "really great" *.txt
Search sub-directories recursively using grep: grep -r <word_to_search> *
OR grep -R <word_to_search> *
Use regular expressions with grep, e.g. the only word in the line is ‘nan’ (quote the pattern so the shell does not interpret it): grep '^nan$' <file_name>
Find a file on the machine by name:sudo find / -name <file_name>
Find all files in the directory and subdirectories that end with .en: find "$PWD" -type f | grep '\.en$'
Find all files in the directory and subdirectories whose path has ‘aaa’ followed by some text: find "$PWD" -type f | grep "aaa."
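A common slip with grep is to write shell-style wildcards inside a regular expression. In a regex, * means “zero or more of the previous character”, so aaa* matches “aa” followed by any number of extra a's. Here is a small sketch on made-up sample lines that counts matches for a few patterns.

```shell
#!/usr/bin/env bash
workdir="$(mktemp -d)"
printf 'nan\nbanana\nnan 0.5\naa.en\naaa_model.en\n' > "$workdir/lines.txt"

# ^ and $ anchor the pattern to the whole line: only the line "nan" matches.
only_nan="$(grep -c '^nan$' "$workdir/lines.txt")"

# As a regex, aaa* also matches "aa.en", which has only two a's.
loose="$(grep -c 'aaa*' "$workdir/lines.txt")"

# Three literal a's followed by at least one more character.
strict="$(grep -c 'aaa.' "$workdir/lines.txt")"

echo "^nan\$ : $only_nan"   # 1
echo "aaa*  : $loose"       # 2
echo "aaa.  : $strict"      # 1
```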
Find files in the current directory whose names include “wonderful”: ls | grep "wonderful"
To search the contents of the files instead, and list only the names of the matching files: grep -l "wonderful" *
If ls generates a very long list and you want to display it page by page: ls | less
List files whose names include a range of numbers:ls model.0{1..3}*
List files whose names include different letters:ls model.{a,b,c,d}
Find installed Python3 packages:pip3 freeze
Find installed Python3 packages whose names include “tensor”; use -i to ignore case: pip3 freeze | grep -i tensor
Find the location of a command (e.g. python3): which python3
Downloading
Download a file using curl: curl <http://some.url> --output <file_name>
If this is the first time you use curl, you might get a message like “Command ‘curl’ not found, but can be installed with: sudo apt install curl”. In this case, run: sudo apt install curl
Download a file that requires cookies:curl --cookie <cookies.txt> <http://some.url> --output <file_name>
To get the “cookies.txt” file, you can use a Chrome extension like “cookies.txt” to export cookies into a TXT file.
Copy a GitHub repository to the machine: git clone https://github.com/USERNAME/REPOSITORYNAME
Update a downloaded GitHub repository:
cd <repository_dir_name>
git checkout master
git pull
The default branch is usually called master or main; if it is neither, replace it with the right name.
Compressing and Extracting
Extract a *.zip file: unzip <file_name>
Create a zip archive from file(s):zip <archive_filename> <file_list>
Create a zip archive from a directory:zip -r <archive_filename.zip> <dir_name>
Extract a *.gz file:gunzip <file_name.gz>
Compress each file in a folder as *.gz:
cd <dir_name>
gzip *
Compress all the files in a directory, including those in its subdirectories:
cd <dir_name>
gzip -r .
Extract a *.tar.gz file:tar xzvf <file_name.tar.gz>
Extract a *.tgz file:tar xzvf <file_name.tgz>
Extract in a different directory:tar xzvf <file_name.tgz> -C </path/dir_name>
OR, to extract a *.tgz file in the current directory: gunzip -c <file_name.tgz> | tar xvf -
Create a *.tar.gz archive: tar -czvf archive.tar.gz <dir_name>
Create a *.tar.gz archive from multiple files/directories: tar -czvf <archive_file_name.tar.gz> <file_name1> <file_name2>
Compress as *.bz2 (higher compression):tar -jcvf <archive_name.tar.bz2> <file_dir_name>
Extract a *.bz2 archive:tar -jxvf <archive_name.tar.bz2>
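To check that an archive contains what you expect before sending it anywhere, here is a round-trip sketch: create a *.tar.gz, list it, extract it into another directory with -C, and compare the result. The directory and file names are made up, and the t (list) option is an addition to the commands above.

```shell
#!/usr/bin/env bash
workdir="$(mktemp -d)"
cd "$workdir" || exit 1

mkdir -p project/src extracted
printf 'print("hi")\n' > project/src/main.py

tar -czf project.tar.gz project          # c: create, z: gzip, f: archive file name
listing="$(tar -tzf project.tar.gz)"     # t: list the contents without extracting
echo "$listing"

tar -xzf project.tar.gz -C extracted     # x: extract, -C: into another directory

# The extracted copy should be identical to the original file.
if cmp --silent project/src/main.py extracted/project/src/main.py; then
  status="round trip ok"
else
  status="round trip failed"
fi
echo "$status"
```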
Nano Editor
Create a new file:nano <new_file_name>
Open an existing file:nano <file_name>
Open multiple files:nano <file_name1> <file_name2>
Search the current file:Ctrl+w
Move to the end of the file: Ctrl+w and then Ctrl+v
Move to the end of the line:Ctrl+e
Move to the start of the line:Ctrl+a
Move a page down:Ctrl+v
Move a page up:Ctrl+y
Mark text: Ctrl+Shift+6 (i.e. Ctrl+^) and then move in the direction you need.
Cut the marked text:Ctrl+k
Paste the cut text:Ctrl+u
Note that to be able to paste across multiple files, both files must be open first: open the two files, copy/cut from the first file, close it, and then paste into the second file.
Close the current file:Ctrl+x
You will be prompted if you want to save; type “y” for yes and “n” for no. If you select to save, just press Enter to keep the current file name. You can also move between two open files as in the next command.
Move between two open files: Alt+. to move forward one file, and Alt+, to move backward one file.
Note that if you are on Mac, Option+. and Option+, are used to insert the ≥ and ≤ symbols, so you need to first press Alt+Command+O to change the behaviour of Option in Terminal.
Advanced Commands
Find out the server date and time:date
Measure the time taken to run a script or command: time <command>, e.g. time python3 script.py
Find out the free space on the disk: df -h
Check free memory (Linux):free -m
Create an alias for a command: alias <alias_name>="<command>"
To save aliases, put them in ~/.bash_aliases: nano ~/.bash_aliases
For example, add this line (use quotes for multi-word commands): alias frz="pip3 freeze"
Then save, and restart your Terminal (or run exec bash). The next time you type frz in the Terminal, it will run the command pip3 freeze.
Keep a command running even if the local Terminal is closed (useful on servers): screen
Create a new screen with a name: screen -S <name>
Create a new named screen with logging enabled; screenlog.0 is created: screen -L -S <name>
Detach the current screen: Ctrl+A, then D
Resume a single screen: screen -r
Resume one of multiple running screens: screen -r <name>
OR screen -r <id>
List the currently running screens: screen -list
OR screen -ls
End a screen: screen -X -S <id> quit
Shut down the machine after a command finishes; separate the two commands with a semicolon: python3 file.py; sudo shutdown
For other shutdown commands, check this answer.
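It is worth knowing how ; differs from &&, because it decides whether the second command runs when the first one fails. This sketch uses false as a stand-in for a failing script and echo as a stand-in for the shutdown, so it is safe to run.

```shell
#!/usr/bin/env bash
# With ";" the second command runs regardless of how the first one ended.
after_semicolon="$(false; echo "second command ran")"

# With "&&" the second command runs only if the first one succeeded.
after_and="$(false && echo "second command ran")"

echo "with ;  -> '$after_semicolon'"   # 'second command ran'
echo "with && -> '$after_and'"         # ''
```

So with ; the machine shuts down even if the script crashes, which is usually what you want on a paid server; with && the machine would stay on after a failure.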
Adjust file permissions, access by the current user only: chmod 700 <file_name>
For example, this is required before using the *.pem key file provided by AWS EC2.
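To confirm that the permissions were actually changed, you can read them back. This sketch uses an empty stand-in file called example_key.pem (a made-up name, not a real key) and assumes GNU stat on Linux.

```shell
#!/usr/bin/env bash
workdir="$(mktemp -d)"
key="$workdir/example_key.pem"   # empty stand-in file, not a real key
touch "$key"

chmod 700 "$key"                 # read, write and execute for the owner only

mode="$(stat -c %a "$key")"      # %a prints the permissions in octal (GNU stat)
echo "$mode"                     # 700
ls -l "$key"                     # the line starts with -rwx------
```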
Display RAM used:free -m
Display GPU memory used:nvidia-smi
Copy a file from the local machine to a server (e.g. AWS EC2); run it from the local machine: scp <file_name> <user>@<server_ip>:/<dir_name>
Copy a directory from the local machine to a server; use -r (run it from the local machine): scp -r <dir_name> <user>@<server_ip>:/<dir_name>
Copy a file from the local machine to AWS EC2 (run it from the local machine): scp -i <key.pem> <file_name> ubuntu@ec2[...].compute.amazonaws.com:~/<dir_name>
Copy a file from a server to the local machine (run it from the local machine): scp <user>@<server_ip>:/<dir_name>/<file_name> </path/on/the/local/machine>
If the server uses a non-default SSH port, pass it to scp with -P <port>.
Copy a file from Google Cloud to the local machine: gcloud compute scp --project <project_name> --recurse <user_name>@machine_name:~/<dir_name>/<file_name> </path/on/the/local/machine>
To copy multiple files (actually, to run any command on multiple files), put the parts of the names that differ between { }, separated by commas, e.g. <file_name>.{en,fr}
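The { } trick is Bash brace expansion: the shell rewrites the braces into separate arguments before the command runs. A safe way to preview what a command will receive is to put echo in front of it, as in this sketch. The file names are made up, and it must be run with bash rather than plain sh.

```shell
#!/usr/bin/env bash
# Preview the expansion with echo before running the real command.
pair="$(echo corpus.{en,fr})"
numbered="$(echo model.0{1..3}.pt)"
echo "$pair"        # corpus.en corpus.fr
echo "$numbered"    # model.01.pt model.02.pt model.03.pt

# The same expansion works with any command, e.g. creating two files at once.
workdir="$(mktemp -d)"
cd "$workdir" || exit 1
touch train.{en,fr}
created="$(ls | paste -s -d ' ' -)"
echo "$created"     # train.en train.fr
```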
Log out of the current connection:Ctrl+D