Awk: A Swiss Army Knife for Text Manipulation
What is awk?
awk is a powerful programming language for working with text files. It’s particularly handy for extracting specific data from files, manipulating columns, and performing calculations on the fly. Awk is open source, and its source code is available on GitHub.
Awk’s key features
- Pattern Matching: Awk excels at finding and processing lines that match specific patterns (like regular expressions).
- Field Separation: It automatically splits input lines into fields based on a delimiter (usually whitespace).
- Built-in Variables:
- $0: The entire input line.
- $1, $2, $3, …: Individual fields within a line.
- NF: The number of fields in a line.
- Actions: You define actions (code blocks) to be executed when a pattern is matched. These actions can involve:
- Printing data.
- Performing calculations.
- Creating new output.
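To make the pattern and action model concrete, here is a minimal sketch. The sample data (name, department, salary) is hypothetical and exists only for illustration; the awk program pairs one pattern with one action and uses $0, $1, and NF together.

```shell
# Hypothetical sample data: name, department, salary (whitespace-separated).
data='alice eng 100
bob ops 80
carol eng 120'

# The pattern /eng/ selects lines; the action prints the first field,
# the number of fields, and the whole line.
result=$(printf '%s\n' "$data" | awk '/eng/ {print $1, NF, $0}')
echo "$result"
```

Only the two lines containing "eng" reach the action, and NF is 3 for each of them because every line has three whitespace-separated fields.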
Common examples of awk usage
Extracting data
Extract/print the second column (a.k.a. field) from a space-separated file:
awk '{print $2}' path/to/text/file.txt
Filtering lines
Print lines that contain the word “keyword”:
awk '/keyword/ {print}' input.txt
Print the second column of the lines containing “foo” in a space-separated file:
awk '/foo/ {print $2}' path/to/file.txt
Explanation: if the line contains “foo”, awk prints its second field to standard output (stdout).
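One detail worth seeing in action: a bare /foo/ pattern is tested against the whole line, so it also matches "food" or a "foo" in any other column. The sketch below uses made-up input and contrasts that with an exact comparison on the first field.

```shell
# Hypothetical sample input; the second line contains "foo" only as part of "food".
data='foo 1
food 2
bar foo
baz 4'

# /foo/ is tested against the whole line, so the first three lines match.
anywhere=$(printf '%s\n' "$data" | awk '/foo/ {print $2}')

# Restrict the test to the first field with an exact string comparison.
exact=$(printf '%s\n' "$data" | awk '$1 == "foo" {print $2}')

echo "$anywhere"
echo "$exact"
```

If you only care about one column, comparing that field directly is usually the safer choice.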
How to print the last column of each line in a file, using a comma (instead of space) as a field separator:
awk -F ',' '{print $NF}' path/to/file
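Because NF is evaluated per line, $NF still picks the last field when lines have different numbers of columns. A small sketch with invented comma-separated input:

```shell
# Hypothetical comma-separated input with a varying number of columns per line.
last=$(printf '%s\n' 'a,b,c' '1,2' 'x,y,z,w' | awk -F ',' '{print $NF}')
echo "$last"
```

Note that -F ',' splits on every comma, so this approach does not handle quoted CSV fields that contain commas themselves.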
How to print every third line starting from the first line:
awk 'NR%3==1' path/to/file
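NR is the number of the current line, and a pattern without an action prints the matching line. To check which lines NR%3==1 selects, this sketch feeds in seven numbered lines (sample input chosen only for illustration):

```shell
# Seven input lines, numbered 1 to 7; NR%3==1 is true for lines 1, 4 and 7.
picked=$(printf '%s\n' 1 2 3 4 5 6 7 | awk 'NR%3==1')
echo "$picked"
```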
How to print different values based on conditions:
awk '{if ($1 == "foo") print "Exact match foo"; else if ($1 ~ "bar") print "Partial match bar"; else print "Baz"}' path/to/file
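The same program can be spread over several lines, which is easier to read once the conditions grow. The sketch below runs it on hypothetical input to show the difference between == (exact string comparison) and ~ (regular expression match anywhere in the field).

```shell
# Hypothetical input: the first field is "foo", "foobar" and "qux" in turn.
labels=$(printf '%s\n' 'foo 1' 'foobar 2' 'qux 3' | awk '{
  if ($1 == "foo") print "Exact match foo"          # the field is exactly "foo"
  else if ($1 ~ "bar") print "Partial match bar"    # the field contains "bar"
  else print "Baz"                                  # everything else
}')
echo "$labels"
```

"foobar" is not equal to "foo", so it falls through to the second branch, where the regular expression finds "bar" inside it.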
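How to print all lines where the 10th column equals a given value (here 42 as an example, passed in as an awk variable with -v):
awk -v value=42 '$10 == value' path/to/file
How to print all lines where the 10th column value is between a minimum and a maximum (example bounds 10 and 20):
awk -v min_value=10 -v max_value=20 '$10 >= min_value && $10 <= max_value' path/to/file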
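Inside an awk program, bare names such as value, min_value, and max_value are awk variables, and they are empty unless something sets them. One common way to set them from the shell is the -v option. The sketch below assumes made-up rows with ten columns and arbitrary example bounds.

```shell
# Hypothetical rows with ten whitespace-separated columns; only the first
# and the tenth column carry meaningful data here.
rows='r1 . . . . . . . . 5
r2 . . . . . . . . 15
r3 . . . . . . . . 25'

# Keep rows whose 10th column lies in the range [10, 20].
in_range=$(printf '%s\n' "$rows" | awk -v min_value=10 -v max_value=20 \
  '$10 >= min_value && $10 <= max_value {print $1, $10}')

# Keep rows whose 10th column equals 25.
equal=$(printf '%s\n' "$rows" | awk -v value=25 '$10 == value {print $1}')

echo "$in_range"
echo "$equal"
```

Because both the field and the -v values look like numbers, awk compares them numerically rather than as strings.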
Column Manipulation
Swap the first and second columns:
awk '{print $2, $1}' input.txt
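The command above prints only the two swapped columns and drops any others. If the file has more columns and you want to keep them, one possible approach is to swap the fields in place through a temporary variable (tmp is just an illustrative name) and then print the rebuilt line:

```shell
# Hypothetical three-column input; swap columns 1 and 2 and keep column 3.
swapped=$(printf '%s\n' 'a b c' '1 2 3' | awk '{tmp = $1; $1 = $2; $2 = tmp; print}')
echo "$swapped"
```

Assigning to a field makes awk rebuild the line using the output field separator (a single space by default), so runs of spaces or tabs in the input are normalized.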
Data Calculations
How to calculate the sum of the values in the first column of a file and print the total:
awk '{s+=$1} END {print "Sum:", s}' path/to/file.txt
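The END block runs once after the last input line, which makes it a natural place for totals. As a small extension of the same idea, this sketch (with invented numbers) also prints the average by dividing the sum by NR, the number of lines read:

```shell
# Hypothetical input: one number per line.
stats=$(printf '%s\n' 10 20 30 | awk '
  {s += $1}                              # accumulate the first column
  END {
    print "Sum:", s + 0                  # s + 0 prints 0 even for empty input
    if (NR > 0) print "Average:", s / NR # guard against division by zero
  }')
echo "$stats"
```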