Graduate Program KB

awk Cheatsheet

  • An awk program is built from rules with the basic structure: pattern { action }

  • We can run awk programs in two ways:

    • In-line: awk 'program' input-file1 input-file2 ... (preferred when program is short)
    • External file: awk -f program-file input-file1 input-file2 ... (preferred for longer programs)
  • If you run an awk program with no input files, it will apply the program to standard input.

  • The name of a program file doesn't matter to awk, and it doesn't need a file extension.

  • We can also write executable awk programs:

    #! /bin/awk -f
    BEGIN { print "Don't Panic!" }
    
    • This can then be run by making the file executable (chmod +x) and invoking it directly.
    • This is handy when users should be able to run the program without needing to know that it is written in awk.
  • Comments start with # and run to the end of the line.
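
The ways of running awk listed above can be sketched in one shell session. This is a minimal illustration rather than a definitive recipe: the file names data.txt, first-field and advice are made up for the example, and because the interpreter path differs between systems, the script looks it up with command -v instead of assuming /bin/awk.

```shell
# Work in a throwaway directory; every file name below is hypothetical.
dir=$(mktemp -d)
cd "$dir" || exit 1

printf 'alpha 1\nbeta 2\n' > data.txt

# 1. In-line: the program is the first argument, input files follow.
inline_out=$(awk '{ print $1 }' data.txt)

# 2. External file: the same program stored in a file and loaded with -f.
#    The file has no extension, and it starts with a # comment.
cat > first-field <<'EOF'
# print the first field of every record
{ print $1 }
EOF
file_out=$(awk -f first-field data.txt)

# 3. No input files: awk applies the program to standard input.
stdin_out=$(printf 'gamma 3\n' | awk '{ print $1 }')

# 4. Executable script: a #! line naming the local awk, then chmod +x.
awk_path=$(command -v awk)
cat > advice <<EOF
#! $awk_path -f
BEGIN { print "Don't Panic!" }
EOF
chmod +x advice
script_out=$(./advice)

echo "$inline_out"
echo "$file_out"
echo "$stdin_out"
echo "$script_out"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

Forms 1 and 2 print the same two lines (alpha and beta), since they run the same program on the same input.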

Quoting Rules

  • Quoted items can be concatenated with nonquoted items as well as with quoted items. The shell turns everything into one argument for the command.
  • Preceding any single character with a \ quotes that character.
  • It is impossible to embed a single quote within single-quoted text.
  • Double quotes are the go-to for embedding things, but some characters still have special meaning inside them: '$', '`', '\' and '"'. Each of these must be preceded by a backslash to be passed through to the program literally.
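
The quoting rules above can be tried out with a few one-liners. This is a small sketch under the assumption of a POSIX-style shell; the variable name greeting and its value are invented for the example.

```shell
# Quoted and unquoted pieces written side by side become one argument.
# The single-quoted program is closed, a shell variable is spliced in
# inside double quotes, and the single quotes are reopened.
greeting="hello"   # hypothetical value
concat_out=$(awk 'BEGIN { print "'"$greeting"'" }')

# A single quote cannot appear inside single quotes, so close the quoted
# text, add a backslash-quoted quote (\'), and reopen.
apostrophe_out=$(awk 'BEGIN { print "Don'\''t Panic!" }')

# Double-quoted program: $ and " are escaped with a backslash so the shell
# passes them through to awk unchanged.
double_out=$(printf 'x y\n' | awk "{ print \"second: \" \$2 }")

echo "$concat_out"
echo "$apostrophe_out"
echo "$double_out"
```

For anything longer than a line or two, keeping the program in single quotes (or in a file run with -f) usually avoids most of this escaping.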

Syntax

  • Like I said before, everything is a pattern followed by an action.
  • BEGIN and END are special patterns: a BEGIN action runs once before any input is read, and an END action runs once after all input has been read.
  • NR is a built-in variable that holds the number of input records read so far.
    • For example, in a CSV file you can skip the header row with the pattern NR > 1.
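
BEGIN, END and NR can be combined in one short program. The sketch below assumes a tiny made-up CSV (the names and scores are invented) and uses awk's standard -F option to split fields on commas, which these notes have not introduced.

```shell
# Hypothetical sample data: a header row followed by two records.
csv=$(mktemp)
printf 'name,score\nann,90\nbob,70\n' > "$csv"

summary=$(awk -F, '
    # Runs once, before the first record is read.
    BEGIN  { print "scores:" }
    # Runs for every record after the header row.
    NR > 1 { print $1; total += $2 }
    # Runs once, after the last record; NR now holds the record count.
    END    { print "rows:", NR - 1, "total:", total }
' "$csv")

rm -f "$csv"
echo "$summary"
```

With this input the END rule prints rows: 2 total: 160, because NR counts all three lines including the header.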