some BASH topics

8 minute read

The more I use bash the more I find it interesting. Basically every time I encounter a useful bash commands or when I learn something new about a command, I write them down for future reference.

Quick links:

Online references:

Regular Expressions

Regular expressions(REGEX) are sets of characters and/or metacharacters that match patterns —- REGEX intro.

A REGEX can contain:

  • character set, eg: abcd
  • anchor, eg:^,$ to match beginning and end of string
  • modifiers, eg: ?,\

REGEX examples

  • ?: match any single characters
  • [abc]: match any characters enclosed
  • [c-g0-7]: match letters from c to g, or numbers from 0 to 7
  • [^b-y]: except
  • [a1-a2]: ascii code from a1 to a2
  • [!9]: NOT 9
  • +: match one or more if previous char
echo aaazz | grep "a\+z"
  • [:upper:] uppercase letters
  • [:lower:] lowercase letters
  • [:alpha:] alphabetic (letters) meaning upper+lower (both uppercase and lowercase letters)
  • [:digit:] numbers in decimal, 0 to 9
  • [:alnum:] alphanumeric meaning alpha+digits (any uppercase or lowercase letters or any decimal digits)
  • [:space:] whitespace meaning spaces, tabs, newlines and similar
  • [:graph:] graphically printable characters excluding space
  • [:print:] printable characters including space
  • [:punct:] punctuation characters meaning graphical characters minus alpha and digits
  • [:cntrl:] control characters meaning non-printable characters
  • [:xdigit:] characters that are hexadecimal digits.

Functions such as awk and egrep provide even richer set of metacharacters.

File Globbing

File Globbing and REGEX can be confusing. REGEX is used in functions for matching text in files, while globbing is used by shells to match file/directory names using wildcards.

Wildcards (some in REGEX may also apply):

  • *: match any string
  • {} is often used to extend list, eg:
    ls {a*,b*} lists files starting with either a or b.
  • []: same as in REGEX

Bash Arrays

  • Arrays can be constructed using round brackets:
    var=(item0 item1 item2) or
    var=($(ls -d ./))
  • To access items or change item values, we can use var[index]. Eg:
    var[index]=new_value
    echo ${var[index]}
    Note that when var is an array, the name var actually only refers to var[0]. To refer to the whole array, need to use var[@] or var[*].
  • sub-array expansion:
    • ${var[*]:s_ind} gives the subarray starting from index s_ind.
    • ${var[@]:s_ind:l} gives you the length l sub-array starting at index s_ind.
    • Can also replace @ with *.

Say courses is a symbolic link I created. If I cd this link, and then print working directory:

$ cd courses 
$ pwd
/Users/lizeng/paths/courses

It’s showing the symbolic path, not the absolute path. To get the absolute path, we can resolve the link through:

$ pwd -P 
/Users/lizeng/Google Drive/Yale/courses

# same also works for many other commands
$ cd courses; cd ..; pwd
/Users/lizeng/paths 
$ cd courses; cd -P ..;pwd 
/Users/lizeng/Google Drive/Yale

The -P here stands for physical (directory)

vim

  • In normal mode: all keys are functional keys. Examples are: -p: paste
    • yy: copy current row to clip board
    • dd: copy row to clip board and delete
    • u (ctrl+R): undo (redo) changes
    • hjkl: left, down, right, up
    • :help <command>: get help on a command — vim open the command txt file
    • :wq or :x: w for save; q for quit
    • :q!: quit without saving
  • Insertion:
    • o: insert a new row after current row
    • O: insert a new row before current row
    • a: insert after cursor
  • Cursor movement:
    • 0, :0: beginning of row, page
    • $, :$: end of row, page
    • ^: to first non-blank character
    • /pattern: search for pattern (press n to go to next)
    • H,M,L: move cursor to top, middle and bottom of page
    • Ctrl + E,Y: scroll up, down
    • Ctrl + u,d: half page up, down
    • w,W,e,E,b,B: jump cursor by words
  • Others: -:syntax on/off : turn on/off text-highlighting colorscheme

find

General syntax: find path -name **** -mtime +1 -newer 20160621 -size +23M ... We will introduce each of above parameters and some more in this section:

  • find ./ -name "*.txt" : searching by name
  • find ./ -type d -name "*LZ*": specify target type, d for directory, f for file.

  • find ./ -newerct 20130323 (or a file) : file created ct after the date (also could be a file). can also use newer just for modified time

  • find ./ -mtime (-ctime, -atime) +n :
    • m for modified time
    • c for creation time
    • a for access time
    • +n for greater than n days, similarly -n for within n days. Can also change measures
  • find ./ -name "PowerGod*" -maxdepth 3:
    set maximum searching depth in this directory; similarly use mindepth to set minimum searching depth

  • -iname : to ignore case

  • piping finded results: -exec cp {} ~/LZfolder/ \;: this command will copy the finded files to path ~/LZfolder/
    • finded file will be placed in the position of {} and execute the command

grep

grep is used for searching lines in a file with certain pattern strings. General formula: grep pattern filename
There are rich parameters you can specify:

  • grep abc$ file: match the end of a string

  • grep ^F file: match the beginning of string

  • grep -w over file: grep for words. In this example, words such as overdue, moreover would be skipped.

  • -A3: also show 3 lines after the lines found

  • -B3: show 3 lines before found lines

  • -C3: show 3 lines before and after

  • logical grep:

    • OR grep: grep pattern1|pattern2 filename
    • AND grep: grep pattern1.*pattern2 filename
    • NOT grep: grep -v pattern filename
      where -v stands for invert match

sed

sed is short for Stream EDitor General formula: sed 's/RegEx/replacement/g' file which will do the work of replacing RegEx with replacement.

  • the separator / could be replaced by something like _, |
    • eg: sed 's | age | year | ' file, and would still work.
  • simple back referencing, eg:

      $echo what | sed 's/wha/&&&/'  # input
      whawhawhat # output
    
  • more on back referencing, eg:

      echo 2014-04-01 | sed 's/\(....\)-\(..\)-\(..\)/\1+	\2+\3/'
      2014+04+01
    

Things in \(...\) are referred. A dot $\cdot$ in Regex can signify any character. Useful to use dots to describe patterns.

  • you can also sed multiple patterns separated by ;, eg: sed s/pattern1/replace1/;s/pattern2/replace2/g < file

head, tail

Shell scripting

Debugging:

  • bash (or sh) -v script.sh : displays each command as the program proceeds

  • bash (or sh) -x script.sh : displays values of variables as program runs

Others

Boolean value

You can try : false; echo $? The output is 1, which means in bash shell:
1 for false
0 for true

Different parenthesis and brackets

See Parenthesis difference.

  • Double parenthesis (arithmetic operator) :
    • (( expr )) : enables the usage of things like <, >, <= etc.
    • echo $(( 5 <= 3 )), and we get 0
    • arithmetic operator interprets 1 as true, and 0 as false, which is different from the test command

Braces

  • Used for parameter expansion. Can create lists which are often used in loops, eg:
$ echo {00..8..2} 
00 02 04 06 08 

Single and double square brackets

Much of below is from bash brackets, and bash test functions.

  • [ expression ] is the same as test expression. eg:
    test -e "$HOME" same as [ -e "$HOME" ]
    and both of them requires careful handling of escaping characters.

  • use -a, -o or ||, && for group testing. eg:

# the following are the same
$ test -e "file1" -a -d "file2"
$ test -e "file1" && test -d "file2"
$ [ -e "file1" ] && [ -d "file2" ]
$ [ -e "file1" -a "file2" ]

Note that [ expr1 ] -a [ expr2 ], [ expr1 && expr2 ] results in error.

  • [[ expression ]] allows you to use more natural syntax for file and string comparisons. If you want to compare number, it’s more common to use double brackets (( )).
    eg. [[ -e "file1" && -e "file2" ]].
    [[ ]] doesn’t support -a, -o inside.

Quotes

Things inside the same quote are considered as one variable.

  • Single quotes: preserves whatever inside
  • Double quotes: do not preserve words involving $ or \ and etc.

    See Quotes difference for more.

Environment Variables

  • $PS1: controls shell prompt
  • $PATH: when shell receives non-builtin command, it goes into $PATH to look for it.
  • $HOME: home directory

Easy command substitute

Say my previous command is vim project.txt. Now I want to open this file instead of using vim. Then I can simply input:

$ !vim:s/vim/open   

where $ is the shell prompt. Basically this is performing sed on whatever the results are from !vim.

Redirection

Bash shell has 3 basic streams: input(0), output(1), and error(2). We can use #number> to redirect them to somewhere else, eg:

  • Input redirection:
    • < or 0<
    • command << EOF, and then manually input argument file, using EOF to end inputting (or use ctrl + D). << is here document symbol.
    • command <<< string : it’s here string symbol. Can input a one row string argument.
  • output redirection:
    • > or 1>: redirect output
    • 2>: redirect error log
    • 2>1& : direct stderr to stdout stream, copy where stdout goes. And 1>2& means vice versa. Here the >& is a syntax to pipe one stream to another.
    • &> filename: join stdout and stderr in one stream, and put in a file.
# input redirection
$ ./myprog < file1.txt
# output and err redirection
$ ./myprog arg1 > file.out
$ ./myprog arg1 2> file.err
$ ./myprog arg1 &> out_and_err
  • Want no output
    Use command > /dev/null 2>&1

~/.bash_profile, ~/.profile and ~/.bashrc

These are files where you can personalize commands to be executed upon shell login.

A bash shell would look for ~/.bash_profile first. If it does not exist, it executes ~/.profile.

When you start a shell in an existing session (such as screen), you get an interactive, non-login shell. That shell may read configurations in ~/.bashrc.

See discussions:
login, non-login
different startup files

Command substitution

If we want to use the output of command 1 in a sentence, we can do it in the following two ways:

... ...  `command 1` ... ... #  method 1   
... ... $(command 1)  ... ... # method 2   

Tags:

Updated: