Explore the uniq Command in Linux with Examples


The uniq command in Linux is used to remove duplicate lines from a text file, but it only works on lines that appear consecutively. To ensure accurate results, data is often sorted before using uniq. This tool is commonly used to clean up text files or process data streams by filtering out repeated lines. With additional options, uniq can also display only duplicates or count how often each line appears, which makes it a useful utility for working with structured text data.

In this article, we will explore several use cases of the Linux uniq command to filter out duplicate lines, display repeated entries, and count occurrences in text files effectively.

How to Use the uniq Command in Linux

To use the uniq command in Linux, you must follow the syntax given below:

uniq [OPTIONS] [INPUT_FILE [OUTPUT_FILE]]

In this syntax, OPTIONS are optional switches that control how uniq processes the input. INPUT_FILE is the file you want to read from; if you don’t provide one, uniq reads from standard input (such as your keyboard or piped data). OUTPUT_FILE is where the result will be saved; if omitted, the output appears directly in the terminal.
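
For instance, here is a minimal sketch of both forms (raw.txt and cleaned.txt are placeholder file names, not files used later in this guide):

uniq raw.txt cleaned.txt
sort raw.txt | uniq

The first command reads raw.txt, collapses consecutive duplicate lines, and writes the result to cleaned.txt instead of the terminal. The second feeds sorted data to uniq through a pipe, which is the most common way to use it.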

Commonly Used uniq Options

Below are some useful options you can use with the uniq command, listed in alphabetical order:

  • -c, --count: Prefixes each line with the number of times it appears consecutively.
  • -d, --repeated: Displays only the lines that are repeated consecutively, printing one line per group.
  • -D, --all-repeated: Displays every occurrence of the lines that are repeated consecutively.
  • -f N, --skip-fields=N: Skips the first N fields on each line when making comparisons.
  • -i, --ignore-case: Ignores case differences while comparing lines (treats uppercase and lowercase as the same).
  • -s N, --skip-chars=N: Ignores the first N characters of each line when comparing.
  • -u, --unique: Shows only the lines that are not repeated consecutively.
  • -w N, --check-chars=N: Compares no more than the first N characters of each line.

Exploring the uniq Command in Linux with Practical Examples

The uniq command is compatible with all major Linux distributions. It’s part of the GNU Core Utilities (coreutils package), which comes pre-installed in almost every Linux distribution. In this section, we will walk you through some useful examples to demonstrate how the uniq command works in Linux:

Preparing a Sample Input File

Now, let’s understand how the uniq command works through a simple example. Imagine you have a text file named uh.txt that contains several repeated lines. Here’s what the content might look like:

sudo apt update
ls -l
ls -l
cd /var/log
cd /var/log
cd /var/log
sudo apt update
df -h
df -h
df -h
uname -r
uname -r
cat /etc/os-release
cat /etc/os-release
sudo apt update

This file includes duplicate lines that you might want to remove. You can use the uniq command to filter out the repeated lines, keeping only one line from each run of consecutive duplicates.
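
If you want to follow along, one quick way to create the file is to paste the lines into cat (assuming a Bash-like shell; any text editor works just as well):

cat > uh.txt

Paste the lines shown above, then press Ctrl+D to finish and return to the prompt.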

Removing Duplicate Lines from a File Using uniq

Let’s run the following command to filter the content of the uh.txt file that contains repeated lines:

uniq uh.txt

This command reads the contents of uh.txt and removes any consecutive repeated lines. Since we didn’t provide an output file, the cleaned result appears directly in our terminal.
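
Given the sample uh.txt shown earlier, the output should look like this:

sudo apt update
ls -l
cd /var/log
sudo apt update
df -h
uname -r
cat /etc/os-release
sudo apt update

Notice that sudo apt update still appears three times, because its occurrences are never consecutive.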

Showing Only Repeated Lines with the uniq Command

We can use the uniq command with the -d option to display only the lines that are repeated:

uniq -d uh.txt

This will print the lines that appear more than once in a row, hiding all unique entries.
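
For our sample file, the output should be:

ls -l
cd /var/log
df -h
uname -r
cat /etc/os-release

sudo apt update is missing even though it appears three times in the file, because none of its occurrences are adjacent.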

Displaying Every Instance of Repeated Lines Using uniq

You can use the -D option with the uniq command to show all occurrences of duplicate lines, not just one from each group:

uniq -D uh.txt

This command returns every repeated line that appears consecutively in the file.
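
With the sample uh.txt, the output should look like this:

ls -l
ls -l
cd /var/log
cd /var/log
cd /var/log
df -h
df -h
df -h
uname -r
uname -r
cat /etc/os-release
cat /etc/os-release

Again, sudo apt update is left out because it never repeats consecutively.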

Counting Repeated Lines Using the uniq Command

We can use the uniq command with the -c option to count how many times each line appears in the input file:

uniq -c uh.txt

It returns each unique line from uh.txt, with the number of its consecutive occurrences shown at the beginning of the line.
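
For the sample file, the output should look roughly like this (uniq right-aligns the counts):

      1 sudo apt update
      2 ls -l
      3 cd /var/log
      1 sudo apt update
      3 df -h
      2 uname -r
      2 cat /etc/os-release
      1 sudo apt update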

Here, it’s important to note that non-adjacent duplicate lines are not counted together. To include them in the count, sort the file and pipe the output to the uniq command:

sort uh.txt | uniq -c

Now, the output shows that the uniq command counts all repeated lines accurately because sorting places identical lines next to each other.
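
With the sample file, the output should look roughly like this:

      2 cat /etc/os-release
      3 cd /var/log
      3 df -h
      2 ls -l
      3 sudo apt update
      2 uname -r

sudo apt update is now counted as 3 because sorting brought all three occurrences together.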

Formatting the Output of the uniq Command

The uniq -c command shows the count first, followed by the line itself. However, it doesn’t let you change the format or separator. If you want more control over how the output looks, you can use awk:

sort uh.txt | uniq -c | awk '{count=$1; $1=""; sub(/^ /, ""); print $0 ": " count}'

This command will display each line first, followed by its count, separated by a colon.
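
For the sample file, this should produce:

cat /etc/os-release: 2
cd /var/log: 3
df -h: 3
ls -l: 2
sudo apt update: 3
uname -r: 2

In the awk script, count saves the number, clearing $1 blanks it out of the line, and sub() strips the leading space that remains, so each line is printed followed by a colon and its count.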

Showing Only Non-Repeated Lines Using uniq

You can use the -u option with the uniq command to display lines that appear only once:

uniq -u uh.txt

This prints only the lines that are never repeated consecutively.
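
For the sample file, the output should be:

sudo apt update
sudo apt update
sudo apt update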

Although sudo apt update is repeated in the file, its occurrences are never adjacent, so uniq considers each of them non-repeated.

Using the uniq Command to Skip the First N Fields and Characters

You can use the -f option to skip the first N fields when comparing lines. This is helpful when each line starts with different numbers or identifiers that you want to ignore:

sort uh.txt | uniq -f 2

By using -f 2, we’re telling uniq to skip the first two whitespace-separated fields of each line and compare only what remains. This lets uniq detect duplicates based on the remaining content rather than on the leading fields. Keep in mind that most lines in uh.txt contain only two fields; once those are skipped there is nothing left to compare, so adjacent two-field lines are treated as duplicates of each other even though the commands differ.
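
With the sample file, the result should look like this:

cat /etc/os-release
sudo apt update
uname -r

After sorting, every two-field line that sits next to another two-field line collapses into the group started by cat /etc/os-release; only sudo apt update, which keeps the field update, and uname -r, which begins a new group after it, are printed. The -f option is more useful on files where every line begins with fields you want to ignore, such as line numbers or timestamps.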

Similarly, you can use the uniq command with the -s option to skip the first N characters:

sort uh.txt | uniq -s 3

The -s 3 option tells uniq to skip the first 3 characters of each line when comparing for duplicates. It only compares the content starting from the 4th character onward.
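
For the sample file, the result should match plain sort uh.txt | uniq, because the lines still differ from the 4th character onward:

cat /etc/os-release
cd /var/log
df -h
ls -l
sudo apt update
uname -r

The option becomes useful when lines share a fixed-length prefix, for example a three-character code, that you want uniq to ignore.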

Just like you can skip characters while comparing lines, you can also tell uniq to compare only a specific number of characters. To do this, use the -w option followed by the number of characters you want to consider:

sort uh.txt | uniq -w 4

This command compares only the first 4 characters of each line to determine duplicates. If the first 4 characters of two lines are the same, uniq treats them as duplicates, even if the rest of the lines are different.
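
With the sample file, the output again matches sort uh.txt | uniq, since no two different commands share the same first four characters:

cat /etc/os-release
cd /var/log
df -h
ls -l
sudo apt update
uname -r

If the file had also contained sudo apt upgrade next to sudo apt update, -w 4 would have treated them as duplicates, because both begin with sudo.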

Ignoring Letter Case While Comparing Lines with uniq

You can use the -i option with the uniq command to compare lines without considering uppercase or lowercase differences. For example, suppose the file contained the adjacent lines “welcome to ultahost” and “WELCOME TO ULTAHOST”. Without the -i option, uniq would treat them as two different lines because of the case difference; with -i, the second line is considered a duplicate of the first, so it would be removed from the output:

uniq -i uh.txt

The -i option is helpful when you want to treat words like “Linux” and “linux” as the same during comparison.
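
As a quick sketch, you can see the effect with a small throwaway file (case.txt is a hypothetical file, not part of the sample used above):

printf 'Linux\nlinux\nLINUX\n' > case.txt
uniq -i case.txt

This should print a single line, Linux, because all three lines compare as equal once case is ignored.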

Conclusion

The uniq command is a useful command-line tool for handling duplicate lines in Linux text files. It helps clean up data by filtering out consecutive duplicates, and with options like -c, -d, -u, and -i, you can count, isolate, or ignore specific lines based on your needs. To get accurate results, especially with scattered duplicates, pairing uniq with sort is essential. In this article, we explained the working of the uniq Linux command along with practical examples.

We hope this guide helped you understand how the uniq command works in Linux. Consider Ultahost’s SSH VPS Hosting, which provides full root access, allowing you to use commands like uniq effectively for text processing, log management, and automation. With complete control over your server, you can streamline operations and handle data more efficiently.

