The uniq command in Linux is used to remove duplicate lines from a text file, but it only works on lines that appear consecutively. To ensure accurate results, data is often sorted before using uniq. This tool is commonly used to clean up text files or process data streams by filtering out repeated lines. With additional options, uniq can also display only duplicates or count how often each line appears, which makes it a useful utility for working with structured text data.
In this article, we will explore several use cases of the Linux uniq command to filter out duplicate lines, display repeated entries, and count occurrences in text files effectively.
To use the uniq command in Linux, you must follow the syntax given below:
uniq [OPTIONS] [INPUT_FILE [OUTPUT_FILE]]
In this syntax, OPTIONS are optional flags that control how uniq processes the input. INPUT_FILE is the file to read from; if you don’t provide one, uniq reads from standard input (for example, data piped from another command). OUTPUT_FILE is where the result is saved; if omitted, the output appears directly in the terminal.
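As a quick illustration of the syntax, using a small made-up input, uniq can read from a pipe and, when an output file is supplied, write the result there instead of the terminal. With GNU uniq, a - can stand in for standard input:

```shell
# Read from a pipe (standard input); the result prints to the terminal:
printf 'apple\napple\nbanana\n' | uniq

# Use "-" as INPUT_FILE (stdin) and write the result to OUTPUT_FILE:
printf 'apple\napple\nbanana\n' | uniq - deduped.txt
cat deduped.txt   # same result, read back from the file
```

Both runs collapse the two adjacent "apple" lines into one.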
Experience Ultahost’s Cheap Linux VPS!
Ultahost Linux VPS gives you complete control and flexibility. It handles all the administrative tasks to ensure your servers stay fast and reliable!
Below are some useful options you can use with the uniq command, listed in alphabetical order:

-c: prefix each line with the number of its consecutive occurrences
-d: print only one copy of each line that is repeated consecutively
-D: print all copies of lines that are repeated consecutively
-f N: skip the first N fields when comparing lines
-i: ignore case when comparing lines
-s N: skip the first N characters when comparing lines
-u: print only lines that are not repeated consecutively
-w N: compare no more than the first N characters of each line
The uniq command is compatible with all major Linux distributions. It’s part of the GNU Core Utilities (coreutils package), which comes pre-installed in almost every Linux distribution. In this section, we will walk you through some useful examples to demonstrate how the uniq command works in Linux:
Now, let’s understand how the uniq command works through a simple example. Imagine you have a text file named uh.txt that contains several repeated lines. Here’s what the content might look like:
sudo apt update
ls -l
ls -l
cd /var/log
cd /var/log
cd /var/log
sudo apt update
df -h
df -h
df -h
uname -r
uname -r
cat /etc/os-release
cat /etc/os-release
sudo apt update
This file includes duplicate lines that you might want to remove. You can use the uniq command to collapse those repeated lines, keeping only one copy of each run of consecutive duplicates.
Let’s run the following command to filter the content of the uh.txt file that contains repeated lines:
uniq uh.txt
This command reads the contents of uh.txt and removes any consecutive repeated lines. Since we didn’t provide an output file, the cleaned result appears directly in the terminal:
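Using the sample content shown earlier, the run can be reproduced like this (the heredoc simply recreates uh.txt):

```shell
# Recreate the sample uh.txt from the article:
cat > uh.txt <<'EOF'
sudo apt update
ls -l
ls -l
cd /var/log
cd /var/log
cd /var/log
sudo apt update
df -h
df -h
df -h
uname -r
uname -r
cat /etc/os-release
cat /etc/os-release
sudo apt update
EOF

# Collapse each run of consecutive duplicates to a single line:
uniq uh.txt
```

Notice that "sudo apt update" still appears three times in the output, because its copies are separated by other lines and are therefore not adjacent.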
We can use the uniq command with the -d option to display only the lines that are repeated:
uniq -d uh.txt
This will print the lines that appear more than once in a row, hiding all unique entries:
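As a minimal sketch with a small made-up input, -d prints one copy of each group of adjacent duplicates and suppresses everything else:

```shell
# Only "ls -l" and "uname -r" appear more than once in a row,
# so only those two lines are printed (once each):
printf 'ls -l\nls -l\ndf -h\nuname -r\nuname -r\n' | uniq -d
```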
You can use the -D option with the uniq command to show all occurrences of duplicate lines, not just one from each group:
uniq -D uh.txt
This command returns every repeated line that appears consecutively in the file:
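The -D option (a GNU coreutils extension, equivalent to --all-repeated) keeps every member of each duplicate group rather than just one. A minimal sketch with a made-up input:

```shell
# Each adjacent duplicate group is printed in full;
# the non-repeated "df -h" line is suppressed:
printf 'ls -l\nls -l\ndf -h\nuname -r\nuname -r\n' | uniq -D
```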
We can use the uniq command with the -c option to count how many times each line appears in the input file:
uniq -c uh.txt
It returns each unique line from uh.txt, with the number of its consecutive occurrences shown at the beginning of the line:
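A minimal sketch of the counting behavior with a small made-up input:

```shell
# Each output line is prefixed with its consecutive-occurrence count
# (GNU uniq right-aligns the number):
printf 'df -h\ndf -h\nls -l\n' | uniq -c
```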
Here, it’s important to note that non-adjacent duplicate lines are not counted together. To include them in the count, sort the file and pipe the output to the uniq command:
sort uh.txt | uniq -c
Now, the output shows that the uniq command counts all repeated lines accurately because sorting places identical lines next to each other:
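The effect of sorting first can be seen with a small made-up input where the duplicates are not adjacent:

```shell
# Without sort, the two "df -h" lines would be counted separately;
# sorting groups them so uniq -c reports a single count of 2:
printf 'df -h\nls -l\ndf -h\n' | sort | uniq -c
```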
The uniq -c command shows the count first, followed by the line itself. However, it doesn’t let you change the format or separator. If you want more control over how the output looks, you can use awk:
sort uh.txt | uniq -c | awk '{c=$1; $1=""; sub(/^ */, ""); print $0 ": " c}'
This command will display the line first, followed by its count, separated by a colon:
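With a small made-up input, the awk stage behaves like this: it captures the count, strips it from the record, and prints the remaining text first. Capturing the whole remainder (rather than a single field) matters because the lines themselves contain spaces:

```shell
# Prints e.g. "df -h: 2" rather than cutting the line at its first word:
printf 'df -h\ndf -h\nls -l\n' | sort | uniq -c \
  | awk '{c=$1; $1=""; sub(/^ */, ""); print $0 ": " c}'
```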
You can use the -u option with the uniq command to display lines that appear only once:
uniq -u uh.txt
This prints only the lines that are not repeated anywhere in the file (as long as they are not repeated consecutively):
Although these lines are repeated, they are not adjacent, so uniq treats them as non-repeated.
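A minimal sketch of this behavior with a made-up input:

```shell
# "sudo apt update" appears twice, but never consecutively,
# so -u still prints both copies; the adjacent "ls -l" pair is dropped:
printf 'sudo apt update\nls -l\nls -l\nsudo apt update\n' | uniq -u
```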
You can use the -f option to skip the first N fields when comparing lines. This is helpful when each line starts with different numbers or identifiers that you want to ignore:
sort uh.txt | uniq -f 2
By using -f 2, we’re telling uniq to skip the first two whitespace-separated fields on each line and compare only what remains. This lets the command detect duplicates based on the actual content rather than leading numbers or identifiers:
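The field-skipping behavior is easier to see with a small hypothetical numbered file; here -f 1 skips the single leading number so the two lines compare as equal:

```shell
# Fields are runs of non-blank characters; -f 1 ignores "1." and "2.",
# so both lines compare as "df -h" and only the first is kept:
printf '1. df -h\n2. df -h\n' | uniq -f 1
```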
Similarly, you can use the uniq command with the -s option to skip the first N characters:
sort uh.txt | uniq -s 3
The -s 3 option tells uniq to skip the first 3 characters of each line when comparing for duplicates. It only compares the content starting from the 4th character onward:
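A minimal sketch with a made-up input that has three-character prefixes:

```shell
# -s 3 ignores the first three characters ("AA-" and "BB-"),
# so both lines compare as "apple" and only the first is kept:
printf 'AA-apple\nBB-apple\n' | uniq -s 3
```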
Just like you can skip characters while comparing lines, you can also tell uniq to compare only a specific number of characters. To do this, use the -w option followed by the number of characters you want to consider:
sort uh.txt | uniq -w 4
This command compares only the first 4 characters of each line to determine duplicates. If the first 4 characters of two lines are the same, uniq treats them as duplicates, even if the rest of the lines are different:
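A minimal sketch with a made-up log-style input:

```shell
# Only the first 4 characters are compared, so both "warn" lines
# count as duplicates even though their messages differ:
printf 'warn: disk full\nwarn: cpu high\ninfo: ok\n' | uniq -w 4
```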
You can use the -i option with the uniq command to compare lines without considering uppercase or lowercase differences. For example, if we use the uniq command without the -i option, it will treat the “welcome to ultahost” and “WELCOME TO ULTAHOST” as two different lines because of the case difference. However, if we use the -i option, the second line is considered a duplicate of the first, so it’ll be removed from the output:
uniq -i uh.txt
The -i option is helpful when you want to treat words like “Linux” and “linux” as the same during comparison.
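The example described above can be sketched directly:

```shell
# Case-insensitive comparison treats both lines as the same,
# so only the first one survives:
printf 'welcome to ultahost\nWELCOME TO ULTAHOST\n' | uniq -i
```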
The uniq command is a useful command-line tool for handling duplicate lines in Linux text files. It helps clean up data by filtering out consecutive duplicates, and with options like -c, -d, -u, and -i, you can count, isolate, or ignore specific lines based on your needs. To get accurate results, especially with scattered duplicates, pair uniq with sort. In this article, we explained how the Linux uniq command works, along with practical examples.
We hope this guide helped you understand how the uniq command works in Linux. Consider Ultahost’s SSH VPS Hosting, which provides full root access, allowing you to use commands like uniq effectively for text processing, log management, and automation. With complete control over your server, you can streamline operations and handle data more efficiently.
The uniq command filters out repeated lines in a text file, but only if they appear consecutively. It’s useful for cleaning or analyzing text data.
Because uniq only removes adjacent duplicates, sorting ensures that all duplicate lines are grouped together so uniq can detect and filter them properly.
You can run uniq -d filename.txt to show lines that are repeated and adjacent.
The -d option displays only one instance of each duplicate group, while -D prints all occurrences of those duplicate lines.
Use uniq -c filename.txt to prefix each line with its occurrence count.
Run uniq -i filename.txt to treat uppercase and lowercase letters as the same.
Use uniq -u filename.txt to filter out all repeated lines and display only unique ones.