Awk & Sed Introduction and Printing Operations
— ny_wk

Awk and sed are the two text-processing power tools every Linux sysadmin should master: awk slices files into records and fields to build formatted reports, while sed is a stream editor that finds, substitutes, deletes, and rewrites text on the fly. This guide walks through both tools from first principles with corrected, copy-paste-ready examples you can run on any modern Linux or macOS shell.
If you have only ever used sed to swap one word for another, you are using a fraction of its power. The same goes for awk, which is a full programming language with variables, conditions, loops, and arithmetic. Learning awk and sed together turns one-off manual edits into repeatable, scriptable pipelines.
What Awk is and why sysadmins use it
Awk is a pattern-scanning and data-extraction language named after its three creators at Bell Labs: Alfred Aho, Peter Weinberger, and Brian Kernighan. It reads input line by line, splits each line into fields, and runs the actions you define whenever a line matches a pattern.
On many Linux distributions, awk is a symlink to gawk (GNU awk); some, such as Debian and Ubuntu, have traditionally defaulted to mawk instead. Either way, awk is ideal for log analysis, CSV manipulation, and one-line reports. Key characteristics:
- Awk treats a text file as a set of records (lines by default) and fields (words by default).
- It supports variables, conditionals, loops, arrays, and arithmetic and string operators.
- It reads from a file or standard input and writes to standard output, so it slots cleanly into pipes.
- It is designed for structured text — do not point it at binary files.
Awk syntax and working model
The general form of an awk program is a series of pattern { action } rules:
awk 'pattern { action }' file
- The pattern is a condition or regular expression that selects lines.
- The action (inside braces) is the statement(s) to run on matching lines; separate multiple statements with semicolons.
- Single quotes wrap the program so the shell does not expand $, *, or other special characters.
Either the pattern or the action is optional, but not both. With no pattern, the action runs on every line. With no action, the default action is to print the whole matching line. Note the difference between omitting the braces and writing empty braces: {} does nothing at all and suppresses the default print.
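The three cases above (pattern only, action only, and a pattern with empty braces) can be compared side by side. This is a minimal sketch; the three-line input and the variable names are made up for the illustration and are not part of employee.txt.

```shell
# Hypothetical three-line input, used only for this illustration.
input=$(printf 'alpha 1\nbeta 2\ngamma 3\n')

# Pattern only: the default action prints each matching line.
pattern_only=$(printf '%s\n' "$input" | awk '/beta/')

# Action only: with no pattern, the action runs on every line.
action_only=$(printf '%s\n' "$input" | awk '{ print $1 }')

# Pattern with empty braces: the rule matches, but nothing is printed.
empty_braces=$(printf '%s\n' "$input" | awk '/beta/ {}')

printf 'pattern only: %s\n' "$pattern_only"
printf 'action only:\n%s\n' "$action_only"
printf 'empty braces: [%s]\n' "$empty_braces"
```

The last line prints an empty pair of brackets, which is the quickest way to see that {} replaces the default print rather than falling back to it.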
A sample data file
All awk examples below use this whitespace-separated employee.txt. Each row is ID, Name, Role, Department, Salary:
100 Thomas Manager Sales $5,000
200 Jason Developer Technology $5,500
300 Sanjay Sysadmin Technology $7,000
400 Nisha Manager Marketing $9,500
500 Randy DBA Technology $6,000
Awk printing operations, field by field
These are the everyday awk patterns that cover the bulk of real sysadmin work.
1. Print every line (default behaviour)
With no pattern and a bare print, awk echoes each line. print with no argument prints $0, the whole record:
awk '{ print }' employee.txt
This is functionally the same as cat employee.txt, but it confirms awk is reading the file correctly before you add logic.
2. Print only the lines that match a pattern
To print rows containing either "Thomas" or "Nisha", use the logical OR operator || between two regex patterns. A common copy-paste error is to write /Thomas/ > /Nisha/; the > is a comparison, not OR, and will not do what you expect.
awk '/Thomas/ || /Nisha/' employee.txt
100 Thomas Manager Sales $5,000
400 Nisha Manager Marketing $9,500
3. Print specific fields
Awk automatically splits each record on whitespace and stores the pieces in $1, $2, and so on; $0 is the entire line. The built-in variable NF holds the Number of Fields, so $NF is the last field. To print the name and salary columns:
awk '{ print $2, $5 }' employee.txt
awk '{ print $2, $NF }' employee.txt
Thomas $5,000
Jason $5,500
Sanjay $7,000
Nisha $9,500
Randy $6,000
The comma between fields inserts the output field separator (a space by default). Omit the comma — print $2 $5 — and awk concatenates the values with no space.
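The effect of the comma is easiest to see on a single record. The sketch below feeds one row from employee.txt through awk three ways; the OFS value " | " is an arbitrary choice for the demonstration.

```shell
# One record copied from employee.txt.
row='100 Thomas Manager Sales $5,000'

# Comma: fields are joined with OFS, a single space by default.
with_comma=$(printf '%s\n' "$row" | awk '{ print $2, $5 }')

# No comma: the two values are concatenated with nothing between them.
no_comma=$(printf '%s\n' "$row" | awk '{ print $2 $5 }')

# Setting OFS in BEGIN changes what the comma inserts.
custom_ofs=$(printf '%s\n' "$row" | awk 'BEGIN { OFS = " | " } { print $2, $5 }')

printf '%s\n' "$with_comma" "$no_comma" "$custom_ofs"
```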
4. BEGIN and END blocks for headers and footers
Two special patterns control setup and teardown. The BEGIN block runs once before any input is read; the END block runs once after the last line is processed. They are perfect for printing report headers, initialising counters, and printing summaries. In awk, # starts a comment.
awk 'BEGIN { print "Name\tDesignation\tDept\tSalary" } { print $2"\t"$3"\t"$4"\t"$NF } END { print "----\nReport generated" }' employee.txt
Name Designation Dept Salary
Thomas Manager Sales $5,000
Jason Developer Technology $5,500
Sanjay Sysadmin Technology $7,000
Nisha Manager Marketing $9,500
Randy DBA Technology $6,000
----
Report generated
For perfectly aligned columns, reach for printf instead of print, for example printf "%-8s %-10s\n", $2, $3.
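As a rough sketch of the printf approach, the script below recreates employee.txt in a temporary file and prints name, role, and salary in fixed-width columns. The widths (8, 10, and 8) are assumptions that happen to fit this sample data; wider values would need wider columns.

```shell
# Recreate the sample data in a temporary file.
employees=$(mktemp)
cat > "$employees" <<'EOF'
100 Thomas Manager Sales $5,000
200 Jason Developer Technology $5,500
300 Sanjay Sysadmin Technology $7,000
400 Nisha Manager Marketing $9,500
500 Randy DBA Technology $6,000
EOF

# %-8s and %-10s left-justify; %8s right-justifies the salary column.
report=$(awk '{ printf "%-8s %-10s %8s\n", $2, $3, $NF }' "$employees")
printf '%s\n' "$report"

rm -f "$employees"
```

Unlike print, printf adds no newline of its own, which is why the format string ends in \n.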
5. Numeric field comparisons
Patterns can test field values numerically. To list employees whose ID is greater than 200, compare $1 directly — awk treats it as a number in numeric context:
awk '$1 > 200' employee.txt
300 Sanjay Sysadmin Technology $7,000
400 Nisha Manager Marketing $9,500
500 Randy DBA Technology $6,000
6. Match a field against a regular expression
The ~ operator means "matches this regex" (and !~ means "does not match"). To print everyone in the Technology department, test the fourth field:
awk '$4 ~ /Technology/' employee.txt
200 Jason Developer Technology $5,500
300 Sanjay Sysadmin Technology $7,000
500 Randy DBA Technology $6,000
7. Count matching records
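Combine a pattern, a counter, and an END block to produce a tally. Uninitialised awk variables behave as 0 in numeric context, so count++ works without any setup; adding BEGIN { count = 0 } is optional, but it makes the END block print 0 rather than an empty string when nothing matches: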
awk '$4 ~ /Technology/ { count++ } END { print "Employees in Technology =", count }' employee.txt
Employees in Technology = 3
Swap in { sum += $1 } to total a column, or build an associative array (dept[$4]++) to group counts by department in a single pass — that is where awk leaves simple grep far behind.
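Both ideas, grouping with an associative array and totalling a column, might look like the sketch below. Stripping "$" and "," from the salary before adding is an assumption about this file's format, and the variable names are illustrative only.

```shell
# Recreate the sample data in a temporary file.
employees=$(mktemp)
cat > "$employees" <<'EOF'
100 Thomas Manager Sales $5,000
200 Jason Developer Technology $5,500
300 Sanjay Sysadmin Technology $7,000
400 Nisha Manager Marketing $9,500
500 Randy DBA Technology $6,000
EOF

# Count employees per department. "for (d in dept)" visits keys in an
# unspecified order, so the result is piped through sort.
dept_counts=$(awk '{ dept[$4]++ } END { for (d in dept) print d, dept[d] }' "$employees" | sort)

# Total the salary column. The "$" and "," are removed first because
# awk would otherwise treat a value such as "$5,000" as 0.
total_salary=$(awk '{ s = $NF; gsub(/[$,]/, "", s); total += s } END { print total }' "$employees")

printf '%s\n' "$dept_counts"
printf 'Total salary = %s\n' "$total_salary"

rm -f "$employees"
```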
Sed: the stream editor for automated text edits
Sed reads input line by line, applies your editing commands to each line in a temporary buffer (the "pattern space"), and prints the result. Because it is non-interactive and reads from a stream, it is perfect for scripted, repeatable edits across many files. Most engineers only ever use its substitute command, but sed can delete, insert, append, transform, and filter too.
A sample data file for sed
The sed examples use this file.txt:
unix is great os. unix is opensource. unix is free os.
learn operating system.
unixlinux which one you choose.
Important safety note: by default sed writes the result to standard output and does not modify the file. To edit the file in place, add the -i flag — and always test without it first, or use -i.bak to keep a backup copy.
1. Substitute the first match on each line
The substitute command is s/pattern/replacement/. By default it replaces only the first occurrence on each line:
sed 's/unix/linux/' file.txt
linux is great os. unix is opensource. unix is free os.
learn operating system.
linuxlinux which one you choose.
The s is the substitute operation, the / characters are delimiters, the first segment is the search regex, and the second is the replacement.
2. Replace the Nth occurrence on a line
Add a number flag to target a specific occurrence. This replaces the second "unix" on each line:
sed 's/unix/linux/2' file.txt
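To see what the numeric flag does, the sketch below recreates file.txt in a temporary file and applies the command. Only the first line has a second "unix", so it is the only line that changes.

```shell
# Recreate the sed sample file in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
unix is great os. unix is opensource. unix is free os.
learn operating system.
unixlinux which one you choose.
EOF

# The 2 flag replaces only the second match on each line.
second_only=$(sed 's/unix/linux/2' "$sample")
printf '%s\n' "$second_only"

rm -f "$sample"
```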
3. Replace every occurrence with the g flag
The global flag g replaces all matches on each line:
sed 's/unix/linux/g' file.txt
linux is great os. linux is opensource. linux is free os.
learn operating system.
linuxlinux which one you choose.
4. Replace from the Nth occurrence onward
Combine a number with g to replace from the Nth match to the end of the line: 3g means "from the third occurrence onward". This combination is GNU sed behaviour; POSIX leaves it unspecified, so other sed implementations may reject it. The first sample line contains three "unix" instances (before "is great", "is opensource", and "is free"), so only the third is changed:
sed 's/unix/linux/3g' file.txt
unix is great os. unix is opensource. linux is free os.
learn operating system.
unixlinux which one you choose.
5. Change the delimiter
When the pattern itself contains slashes (such as a URL), escaping every / gets ugly. Sed lets you use any character as the delimiter — just put it right after the s. These three commands are equivalent:
sed 's/http:\/\//www/' file.txt
sed 's_http://_www_' file.txt
sed 's|http://|www|' file.txt
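Since file.txt contains no URL, the equivalence of the delimiters is easier to check on a sample string. The sketch below uses a hypothetical URL and, as an arbitrary choice for the demonstration, rewrites http:// to https:// with each delimiter in turn.

```shell
# Hypothetical URL, used only for this illustration.
url='http://example.com/docs'

# Slash delimiter: every slash in the pattern and replacement is escaped.
with_slash=$(printf '%s\n' "$url" | sed 's/http:\/\//https:\/\//')

# Pipe and underscore delimiters: no escaping needed.
with_pipe=$(printf '%s\n' "$url" | sed 's|http://|https://|')
with_underscore=$(printf '%s\n' "$url" | sed 's_http://_https://_')

printf '%s\n' "$with_slash" "$with_pipe" "$with_underscore"
```

All three variables end up holding the same rewritten URL.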
6. Reuse the matched text with &
In the replacement, & stands for the entire matched string. This is handy when you want to wrap or duplicate a match rather than replace it:
- sed 's/unix/{&}/' file.txt wraps the first "unix" in braces: {unix}.
- sed 's/unix/{&&}/' file.txt duplicates it: {unixunix}.
7. Capture groups with backreferences \1 to \9
Parentheses (escaped in basic regex as \( and \)) create capture groups you can reference in the replacement as \1, \2, and so on. Examples:
- Double a word: sed 's/\(unix\)/\1\1/' file.txt turns the first "unix" into "unixunix".
- Swap two adjacent words: sed 's/\(unix\)\(linux\)/\2\1/' file.txt turns "unixlinux" into "linuxunix".
- Reverse the first three characters of each line: sed 's/^\(.\)\(.\)\(.\)/\3\2\1/' file.txt.
Tip: GNU sed's -E (extended regex) flag lets you drop the backslashes — sed -E 's/(unix)(linux)/\2\1/' — which is far more readable.
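The backreference examples can be checked on individual lines from file.txt. This sketch runs the swap in both basic and extended syntax and the three-character reversal; the variable names are illustrative.

```shell
# Swap "unix" and "linux" using basic regex (escaped parentheses).
swap_bre=$(printf 'unixlinux which one you choose.\n' | sed 's/\(unix\)\(linux\)/\2\1/')

# The same swap with -E (extended regex, plain parentheses).
swap_ere=$(printf 'unixlinux which one you choose.\n' | sed -E 's/(unix)(linux)/\2\1/')

# Reverse the first three characters: "lea" becomes "ael".
reversed=$(printf 'learn operating system.\n' | sed 's/^\(.\)\(.\)\(.\)/\3\2\1/')

printf '%s\n' "$swap_bre" "$swap_ere" "$reversed"
```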
8. Print the changed line twice with the p flag
The p flag prints the pattern space again after substitution, so substituted lines appear twice and unchanged lines appear once:
sed 's/unix/linux/p' file.txt
9. Print only the substituted lines
Pair -n (suppress automatic printing) with the p flag to show only the lines where a substitution happened — a clean way to see exactly what changed:
sed -n 's/unix/linux/p' file.txt
Used alone, -n suppresses all output.
10. Chain multiple edits
Run several substitutions in one pass with repeated -e options (cleaner than piping sed into sed):
sed -e 's/unix/linux/' -e 's/os/system/' file.txt
Equivalent pipe:
sed 's/unix/linux/' file.txt | sed 's/os/system/'
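A quick way to confirm that the two forms agree is to capture both outputs and compare them. The sketch below recreates file.txt in a temporary file; it also exposes a side effect worth knowing about, because s/os/system/ is not anchored to whole words.

```shell
# Recreate the sed sample file in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
unix is great os. unix is opensource. unix is free os.
learn operating system.
unixlinux which one you choose.
EOF

# Two expressions in one sed process.
chained=$(sed -e 's/unix/linux/' -e 's/os/system/' "$sample")

# The same edits as two processes joined by a pipe.
piped=$(sed 's/unix/linux/' "$sample" | sed 's/os/system/')

printf '%s\n' "$chained"
# The third line ends in "chosysteme." because the first "os" on that
# line sits inside the word "choose".

rm -f "$sample"
```

If whole-word matching matters, the pattern needs word boundaries or surrounding context; plain s/os/system/ replaces the first "os" wherever it occurs.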
11-13. Target specific lines or matched lines
You can scope a command to a line number, a range, or lines matching a pattern (an "address"):
| Command | What it does |
| --- | --- |
| sed '3 s/unix/linux/' file.txt | Substitute only on line 3. |
| sed '1,3 s/unix/linux/' file.txt | Substitute on lines 1 through 3. |
| sed '2,$ s/unix/linux/' file.txt | Substitute from line 2 to the last line ($). |
| sed '/linux/ s/unix/centos/' file.txt | On lines containing "linux", replace "unix" with "centos". |
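The addresses in the table can be tried against the sample file. In this sketch, which recreates file.txt in a temporary file, each command leaves line 1 untouched and changes only line 3, either because of the line-number address or because line 3 is the only one containing "linux".

```shell
# Recreate the sed sample file in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
unix is great os. unix is opensource. unix is free os.
learn operating system.
unixlinux which one you choose.
EOF

# Line-number address: only line 3 is edited.
line3_only=$(sed '3 s/unix/linux/' "$sample")

# Range address: line 2 through the last line.
from_line2=$(sed '2,$ s/unix/linux/' "$sample")

# Regex address: only lines that contain "linux" are edited.
on_match=$(sed '/linux/ s/unix/centos/' "$sample")

printf '%s\n' "$line3_only" | tail -n 1   # linuxlinux which one you choose.
printf '%s\n' "$on_match" | tail -n 1     # centoslinux which one you choose.

rm -f "$sample"
```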
14-16. Delete, duplicate, and filter like grep
- Delete lines: sed '2d' file.txt removes line 2; sed '5,$d' file.txt removes line 5 to the end.
- Duplicate every line: sed 'p' file.txt prints each line twice.
- Act like grep: sed -n '/unix/p' file.txt prints only matching lines (same as grep unix); sed -n '/unix/!p' file.txt inverts the match (same as grep -v unix), where ! negates the address.
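The grep equivalences above can be verified directly by comparing outputs. This sketch recreates file.txt in a temporary file and runs each sed command next to its grep counterpart.

```shell
# Recreate the sed sample file in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
unix is great os. unix is opensource. unix is free os.
learn operating system.
unixlinux which one you choose.
EOF

# Delete line 2; lines 1 and 3 remain.
without_line2=$(sed '2d' "$sample")

# Print only matching lines, first with sed and then with grep.
sed_match=$(sed -n '/unix/p' "$sample")
grep_match=$(grep unix "$sample")

# Print only non-matching lines, first with sed and then with grep -v.
sed_invert=$(sed -n '/unix/!p' "$sample")
grep_invert=$(grep -v unix "$sample")

printf '%s\n' "$sed_invert"   # learn operating system.

rm -f "$sample"
```

On this file, deleting line 2 and keeping only the "unix" lines give the same result, because line 2 is the only line without a match.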
17-19. Append, insert, and change whole lines
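Sed can add or replace entire lines around a match:
- Append after a match with a: sed '/unix/a Added after' file.txt.
- Insert before a match with i: sed '/unix/i Added before' file.txt.
- Change the whole matched line with c: sed '/unix/c Changed line' file.txt.
These one-line forms are GNU sed syntax. BSD and macOS sed expect a backslash and a newline after the command letter, with the text on the following line.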
20. Transliterate characters with y
The y command maps characters one-to-one, like tr. This uppercases every "u" and "l":
sed 'y/ul/UL/' file.txt
Unix is great os. Unix is opensoUrce. Unix is free os.
Learn operating system.
UnixLinUx which one yoU choose.
Common pitfalls with awk and sed
- Forgetting the quotes. Always single-quote awk and sed programs so the shell does not expand $ or *.
- Assuming sed edits the file. Without -i, sed only prints to stdout. With -i, it overwrites the file permanently, so use -i.bak until you trust the command.
- Confusing > with logical OR. In awk, combine patterns with ||, not >.
- Whitespace columns vs. real delimiters. Awk's default split is on runs of whitespace; for CSV use -F',' and remember that quoted commas need a proper parser.
- BRE vs. ERE. In basic regex (BRE), grouping is written \( and \), and GNU sed spells one-or-more as \+; switch to sed -E, or to awk (which uses ERE), for cleaner patterns.
- Greedy global replace. The g flag changes every match on a line; if you meant only the first, drop it.
Verification: confirm your edits did what you expected
- Dry-run first. Run sed without -i and read the output before committing.
- Diff before and after. sed 's/unix/linux/g' file.txt | diff file.txt - shows exactly which lines change.
- Count matches. Run grep -c unix file.txt before, and re-check after, to confirm the edit landed. Note that grep -c counts matching lines, not individual occurrences.
- Validate awk field logic by printing NF and $0: awk '{ print NF": "$0 }' file.txt reveals mis-split rows.
- Keep a backup. Use cp file.txt file.txt.bak or sed -i.bak so you can roll back instantly.
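Put together, the checks above might look like the following sketch. It works on a temporary copy of the sample file, so the file name and variable names are placeholders rather than a prescribed workflow.

```shell
# Recreate the sed sample file in a temporary file.
target=$(mktemp)
cat > "$target" <<'EOF'
unix is great os. unix is opensource. unix is free os.
learn operating system.
unixlinux which one you choose.
EOF

# Count lines containing "unix" before editing.
# "|| true" is there because grep exits non-zero when the count is 0.
lines_before=$(grep -c unix "$target" || true)

# Dry run: diff exits non-zero when it finds changes, hence "|| true".
preview=$(sed 's/unix/linux/g' "$target" | diff "$target" - || true)
printf '%s\n' "$preview"

# Commit the edit, keeping the original as "$target.bak".
sed -i.bak 's/unix/linux/g' "$target"

lines_after=$(grep -c unix "$target" || true)
backup_lines=$(grep -c unix "$target.bak" || true)
printf 'before=%s after=%s backup=%s\n' "$lines_before" "$lines_after" "$backup_lines"

rm -f "$target" "$target.bak"
```

The attached-suffix form -i.bak is used here because it is accepted by both GNU and BSD sed.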
Key Takeaways
- Awk is a field-aware programming language for extracting columns and building reports; sed is a stream editor for find-and-replace and line edits.
- In awk, $1..$NF are fields, $0 is the whole line, and BEGIN/END blocks handle setup and summaries.
- In sed, s/old/new/ replaces the first match; add g for all, a number for the Nth, and -n .../p to show only changes.
- Sed does not touch your file unless you pass -i, so always dry-run first and keep a .bak.
- Use || for OR in awk, capture groups (\1/\2) in sed, and sed -E for cleaner extended regex (awk patterns are already ERE).
Frequently Asked Questions
What is the difference between awk and sed?
Sed is a stream editor focused on line-oriented find, replace, insert, and delete operations. Awk is a full programming language that understands records and fields, so it excels at column extraction, calculations, and formatted reports. Use sed for quick text substitution and awk when you need field logic, arithmetic, or grouping.
How do I edit a file in place with sed?
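Use the -i flag: sed -i 's/old/new/g' file.txt. Because this overwrites the file permanently, run it first without -i to preview, or use sed -i.bak 's/old/new/g' file.txt to automatically keep a backup named file.txt.bak. On BSD and macOS sed, -i requires a suffix argument, so the no-backup form is written sed -i '' 's/old/new/g' file.txt.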
How do I print a specific column with awk?
Awk splits each line into fields on whitespace by default. Print the second column with awk '{ print $2 }' file, or the last column with awk '{ print $NF }' file. For comma-separated files, set the field separator: awk -F',' '{ print $2 }' file.csv.
What does the g flag mean in sed?
The g (global) flag tells sed to replace every match on each line, not just the first. Without it, s/unix/linux/ changes only the first "unix" per line; with it, s/unix/linux/g changes all of them.