How to extract lines with a specific character count from a text file in the Linux terminal?

Mar 26, 2025 · 586 words · 3 minute read

I needed to extract the lines that contain a specific number of characters from a huge text file. I searched online for possible answers and tried them. Here is what I found.

Using grep

grep is a command-line utility for searching plain-text datasets for lines that match a regular expression. Its name comes from the ed command g/re/p (global regular expression search and print), which has the same effect.

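With grep, we pass the -E flag (extended regular expressions) together with a pattern that matches only lines of the length we want: ^ anchors the start of the line, .{N} matches exactly N characters, and $ anchors the end of the line.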

Here is the command to get the lines that are exactly 1 character long from a plain-text file called pt.txt:

grep -E '^.{1}$' pt.txt

And here is a command to extract lines that have only 2 characters from a plain text file called pt.txt:

grep -E '^.{2}$' pt.txt

And here is a command to extract text lines that are only 8 characters long from a plain text file called pt.txt:

grep -E '^.{8}$' pt.txt
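The three commands above differ only in the number inside the braces, so the length can be turned into a parameter. The following is a minimal sketch, not a definitive implementation: the function name lines_of_length and the sample file contents are made up for illustration.

```shell
# Hypothetical helper: print the lines of FILE that are exactly N characters long.
# Usage: lines_of_length N FILE
lines_of_length() {
    n=$1
    file=$2
    # Double quotes let the shell substitute $n; \$ keeps the end-of-line anchor literal.
    grep -E "^.{$n}\$" "$file"
}

# Throwaway sample file (contents invented for this demo).
sample=$(mktemp)
printf '%s\n' a bb ccc dddd 12345678 > "$sample"

lines_of_length 2 "$sample"
lines_of_length 8 "$sample"

# A range also works: {2,3} matches lines that are 2 or 3 characters long.
grep -E '^.{2,3}$' "$sample"
```

The braces accept a minimum and a maximum, so the same pattern style also covers "at least" and "at most" filters.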

Using egrep

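egrep is just a wrapper that runs the grep -E command. Recent GNU grep releases (3.8 and later) also print a warning that egrep is obsolescent, so grep -E is the safer spelling in scripts.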

egrep is grep -E

So, you can use the command like this.

egrep '^.{1}$' pt.txt

The above command extracts every line that is exactly one character long.

Here is the command to get all lines that contain only 10 characters:

egrep '^.{10}$' pt.txt

Using awk

AWK (/ɔːk/) is a domain-specific language designed for text processing and typically used as a data extraction and reporting tool. Like sed and grep, it is a filter, and it is a standard feature of most Unix-like operating systems.

To get the length of the current line, we use awk's length() function, which returns the length of the current line when called without arguments. We compare that value with == against the number of characters we want. If the condition is true and no action is given, awk prints the line by default. That’s great! Let’s do that!

awk 'length() == 4' pt.txt

In the above command, awk prints to standard output (stdout) all the lines that are exactly 4 characters long.

If you want to extract all lines that contain 12 characters, use this awk command.

awk 'length() == 12' pt.txt
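Because the condition is an ordinary awk expression, the length can be passed in with -v instead of being hard-coded, and == can be swapped for a range check. This is a small sketch under assumptions: the function name lines_between and the sample file are hypothetical.

```shell
# Hypothetical helper: print the lines of FILE whose length is between MIN and MAX, inclusive.
# Usage: lines_between MIN MAX FILE
lines_between() {
    awk -v min="$1" -v max="$2" 'length($0) >= min && length($0) <= max' "$3"
}

# Throwaway sample file (contents invented for this demo).
sample=$(mktemp)
printf '%s\n' a bb ccc dddd eeeee > "$sample"

# Exact length: n is supplied from the command line with -v.
awk -v n=4 'length($0) == n' "$sample"

# Length range: lines with 2 or 3 characters.
lines_between 2 3 "$sample"
```

Passing the same number as MIN and MAX gives the exact-length behaviour of the commands above.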

Use-cases

The task of extracting lines of a specific length comes up in various programming domains, such as data science and defensive/offensive security.

Offensive security experts may use it to extract passwords of a specific length from a plain-text dataset (called a wordlist or password list).

Notes

Accuracy of results

I ran into a weird situation when using the three commands: awk always extracted more lines than grep and egrep, and its results were accurate.

Figure: accuracy of grep, egrep, and awk

So, why didn’t grep grep them all? Unfortunately, I DON’T KNOW.
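One plausible cause, stated here as an assumption that was not verified on the file used in this post, is character encoding. In a UTF-8 locale, GNU grep's . matches one character rather than one byte and does not match bytes that are not valid UTF-8, while some awk implementations count bytes in length(). Wordlists often contain such bytes. The sketch below forces the C locale with LC_ALL=C so that both tools count bytes; the sample file is made up for illustration.

```shell
# Throwaway sample: three lines of two bytes each.
#   "ab"        plain ASCII
#   \303\251    the UTF-8 encoding of "é" (one character, two bytes)
#   \377\376    two bytes that are not valid UTF-8
sample=$(mktemp)
printf 'ab\n\303\251\n\377\376\n' > "$sample"

# With LC_ALL=C every byte is treated as one character by both tools.
grep_count=$(LC_ALL=C grep -c -E '^.{2}$' "$sample")
awk_count=$(LC_ALL=C awk 'length($0) == 2 { n++ } END { print n + 0 }' "$sample")

echo "grep: $grep_count  awk: $awk_count"
```

If the two counts agree under LC_ALL=C but differ without it, the locale is the likely explanation. In a UTF-8 locale the same grep command would be expected to report fewer lines for this sample, because é becomes a single character and the last line is not valid UTF-8.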

Performance and execution speed

I noticed another thing: awk is slower than grep (or egrep). The grep command executed in 47 milliseconds and the egrep command in 48 milliseconds, but the awk command took more than 12 WHOLE seconds. Awk is significantly slower than grep.

Figure: benchmarking grep, egrep, and awk

My opinion and preference

I prefer awk over grep in this case because of the accuracy I found in my experiments. Its slower execution is not significant for me.

I hope you enjoyed reading this post as much as I enjoyed writing it. If you know someone who can benefit from this information, send them a link to this post. If you want to get notified about new posts, follow me on YouTube, Twitter (X), LinkedIn, and GitHub.

Tags: Programming
