awk
Tags: computers
- https://ferd.ca/awk-in-20-minutes.html
- example: https://github.com/ferd/recon/blob/master/script/queue_fun.awk
- another: http://c2.com/doc/expense/
- https://www.youtube.com/watch?v=43BNFcOdBlY
- https://www.youtube.com/watch?v=4UGLsRYDfo8
Structure
-
Everything in awk is a pattern -> action
Pattern1 { ACTIONS; } Pattern2 { ACTIONS; } -
Each line will go through each of the patterns, one at a time
Data Types
- awk only has strings and numbers
- strings can be cast into numbers
- both can be assigned to variables in
ACTIONSwith the=operator - variables can be declared anywhere, uninitialized variables have
""empty string value - awk has unidimensional associative arrays
var[key] = value
- can simulate multidimensional arrays, but not very good
Patterns
Regular and Boolean Expressions
- supports non-pcre regex (
gawkmight) - patterns can’t capture specific groups, only to match
- boolean expressions have
&&||and! - note that
==does fuzzy matching (like js) - booleans can be used alongside regex
/admin/ || debug == true { ACTION }
- specific string matching against regex can use
~or!~ - can also just use
{ACTIONS}to have it run against every line
Special Patterns
BEGINmatches only before any line has been input to the file, can initiate variables and other stateENDlets you do some final cleanup- Fields see below
Fields
# According to the following line
#
# $1 $2 $3
# 00:34:23 GET /foo/bar.html
# \_____________ _____________/
# $0
# Hack attempt?
/admin.html$/ && $2 == "DELETE" {
print "Hacker Alert!";
}
- Fields are default separated by whitespace
- You can modify the line by assigning to the field
Actions
Many possible actions, most useful ones:
{ print $0; } # prints $0. In this case, equivalent to 'print' alone
{ exit; } # ends the program
{ next; } # skips to the next line of input
{ a=$1; b=$0 } # variable assignment
{ c[$1] = $2 } # variable assignment (array)
{ if (BOOLEAN) { ACTION }
else if (BOOLEAN) { ACTION }
else { ACTION }
}
{ for (i=1; i<x; i++) { ACTION } }
{ for (item in c) { ACTION } }
ALL VARIABLES ARE GLOBAL
Functions
Functions can be called with typical syntax { somecall($2) }
Built-in functions: https://www.gnu.org/software/gawk/manual/html_node/Built_002din.html#Built_002din
User defined functions:
# function arguments are call-by-value
function name(parameter-list) {
ACTIONS; # same actions as usual
}
# return is a valid keyword
function add1(val) {
return val+1;
}
Special Variables
BEGIN { # Can be modified by the user
FS = ","; # Field Separator
RS = "\n"; # Record Separator (lines)
OFS = " "; # Output Field Separator
ORS = "\n"; # Output Record Separator (lines)
}
{ # Can't be modified by the user
NF # Number of Fields in the current Record (line)
NR # Number of Records seen so far
ARGV / ARGC # Script Arguments
}