Published Book on Amazon
All of IOT Starting with the Latest Raspberry Pi from Beginner to Advanced – Volume 1
All of IOT Starting with the Latest Raspberry Pi from Beginner to Advanced – Volume 2
Published Korean Edition
최신 라즈베리파이(Raspberry Pi)로 시작하는 사물인터넷(IOT)의 모든 것 – 초보에서 고급까지 (상)
최신 라즈베리파이(Raspberry Pi)로 시작하는 사물인터넷(IOT)의 모든 것 – 초보에서 고급까지 (하)
Original Book Contents
10.9.4 "uniq" Command
This command reads lines from the input, collapses each run of adjacent duplicate lines into a single line, and writes the result to the output.
[Command Format]
uniq [option] [input] [output]
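The three ways of supplying the operands in the format above can be sketched as follows. The file names "colors.txt" and "colors_uniq.txt" and the sample data are hypothetical and exist only for this illustration. Note that when the [output] operand is given, an existing file of that name is overwritten.

```shell
# Hypothetical sample data in a temporary working directory.
workdir=$(mktemp -d)
printf 'red\nred\nblue\n' > "$workdir/colors.txt"

# Form 1: [input] only. The result goes to standard output.
uniq "$workdir/colors.txt"

# Form 2: [input] and [output]. The result is written to the output file.
uniq "$workdir/colors.txt" "$workdir/colors_uniq.txt"
from_file=$(cat "$workdir/colors_uniq.txt")

# Form 3: no operands. uniq reads standard input and writes standard output.
from_stdin=$(uniq < "$workdir/colors.txt")

rm -rf "$workdir"
```

In all three forms the adjacent pair of "red" lines collapses to one, so the result is "red" followed by "blue".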
[Command Overview]
■ Removes adjacent duplicate lines from the input and writes the result to the output.
■ User privilege -- Normal user.
[Detail Description]
■ Duplication is checked only between adjacent lines. If the data is not sorted, duplicate lines that are not next to each other are not removed.
■ When a run of duplicate lines is collapsed, the first line of the run is kept. To remove every duplicate in a file, it is therefore usual to sort the data with the "sort" command first and then execute this command.
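The adjacency rule described above can be checked with a minimal sketch. The three-line input ("apple", "banana", "apple") is made-up sample data, not taken from the book.

```shell
# "apple" occurs twice, but the two occurrences are not adjacent,
# so uniq on its own leaves both in place.
unsorted_result=$(printf 'apple\nbanana\napple\n' | uniq)

# Sorting first places the duplicates next to each other,
# so uniq can collapse them into one line.
sorted_result=$(printf 'apple\nbanana\napple\n' | sort | uniq)

printf 'uniq alone:\n%s\n' "$unsorted_result"
printf 'sort, then uniq:\n%s\n' "$sorted_result"
```

The first result still contains three lines, while the second contains only "apple" and "banana".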
[Main Option]
-c, --count | prefix lines by the number of occurrences
-d, --repeated | only print duplicate lines, one for each group
-f, --skip-fields=N | avoid comparing the first N fields
-i, --ignore-case | ignore differences in case when comparing
-s, --skip-chars=N | avoid comparing the first N characters
-u, --unique | only print unique lines
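The options that change how lines are compared ("-i", "-f", "-s") are easiest to understand on small inputs. The following is a sketch on made-up data; the ID column and the "A-", "B-" prefixes are hypothetical and only serve to show which part of each line is skipped.

```shell
# -i: lines that differ only in letter case count as duplicates.
# The first line of the run is the one that is kept.
ignore_case=$(printf 'Samsung\nsamsung\nSAMSUNG\n' | uniq -i)

# -f 1: skip the first blank-separated field (here a hypothetical ID column)
# before comparing, so "1 Samsung" and "2 Samsung" count as duplicates.
skip_field=$(printf '1 Samsung\n2 Samsung\n3 Sony\n' | uniq -f 1)

# -s 2: skip the first two characters (here a hypothetical "A-" style prefix)
# before comparing, so "A-Sony" and "B-Sony" count as duplicates.
skip_chars=$(printf 'A-Sony\nB-Sony\nC-LG\n' | uniq -s 2)

printf '%s\n---\n%s\n---\n%s\n' "$ignore_case" "$skip_field" "$skip_chars"
```

In each case the skipped or case-folded part is ignored only for the comparison; the surviving line is printed unchanged.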
[Usage Example]
There is a file "customer_list_dup.txt" in the "testdata" directory of the "pi" account, and the contents are as follows.
pi@raspberrypi ~/testdata $ cat customer_list_dup.txt
Microsoft
IBM
Samsung
Samsung
LG
Microsoft
Samsung
Sony
Hewlett-Packard
In the above data, "Microsoft" appears twice and "Samsung" appears three times. Now run the "uniq" command on this file.
pi@raspberrypi ~/testdata $ uniq customer_list_dup.txt
Microsoft
IBM
Samsung
LG
Microsoft
Samsung
Sony
Hewlett-Packard
In the above result, only the two adjacent "Samsung" lines have been collapsed into one. The second "Microsoft" line and the later "Samsung" line are still displayed, because the "uniq" command compares only adjacent lines.
To remove these remaining duplicates, first sort the data with the "sort" command and then pipe the result into the "uniq" command, as follows. Now "Microsoft" and "Samsung" each appear only once.
pi@raspberrypi ~/testdata $ sort customer_list_dup.txt | uniq
Hewlett-Packard
IBM
LG
Microsoft
Samsung
Sony
The "uniq" command can report more information through its options. With the "-c" option, each output line is prefixed with the number of times that line occurred.
pi@raspberrypi ~/testdata $ sort customer_list_dup.txt | uniq -c
      1 Hewlett-Packard
      1 IBM
      1 LG
      2 Microsoft
      3 Samsung
      1 Sony
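The remaining reporting options can be tried on the same data. The sketch below recreates the nine-line customer list from the start of this example in a temporary file, so it does not depend on the "testdata" directory. The final pipeline, which ranks names by frequency with a second "sort", is a common idiom added here for illustration and is not part of the original example.

```shell
# Recreate the nine-line sample list, one name per line.
list=$(mktemp)
printf '%s\n' Microsoft IBM Samsung Samsung LG Microsoft Samsung Sony Hewlett-Packard > "$list"

# -d: print one copy of each name that occurs more than once.
repeated=$(sort "$list" | uniq -d)

# -u: print only the names that occur exactly once.
unique_only=$(sort "$list" | uniq -u)

# -c followed by a reverse numeric sort: most frequent name first.
# awk strips the leading padding that uniq -c puts before the count.
top=$(sort "$list" | uniq -c | sort -rn | head -n 1 | awk '{print $1, $2}')

printf 'repeated:\n%s\n' "$repeated"
printf 'unique only:\n%s\n' "$unique_only"
printf 'most frequent: %s\n' "$top"

rm -f "$list"
```

Here "-d" reports "Microsoft" and "Samsung", "-u" reports the four names that occur once, and the most frequent entry is "3 Samsung".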