File/Text Processor Commands in Linux

In Linux, we mostly use text and file processing commands to filter, cut, and reshape data and print the desired results. These commands are very powerful and are a big part of what makes the Linux command line so productive. This article covers the following commands:

  1. cut command

  2. awk command

  3. grep or egrep command

  4. sort command

  5. uniq command

  6. wc command

1 ) cut command :

-> It is a command line utility that cuts parts of lines from the specified files or from piped data (|) and prints the result to standard output.

-> It can cut parts of a line by delimiter-separated field, byte position, or character position.

$ cut -c1 file
$ cut -c1-n file

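In the first command, "cut" is the command, "-c" stands for character, and "1" selects the first character of each line in the file.

In the second command, the "n" in "-c1-n" stands for any number. You give the range of character positions that you want to cut from each line, for example "-c1-5" for the first five characters.

If you use "b" (byte) in place of "c", the output is the same as long as every character is a single byte, as in plain ASCII text. With multi-byte characters the two can differ, depending on the cut implementation.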
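The character and byte options above can be tried on a small file. This is a minimal sketch, assuming a hypothetical three-line sample written to a temporary file; the variable names are only for illustration.

```shell
# Hypothetical sample data written to a temporary file.
sample=$(mktemp)
printf 'apple\nbanana\ncherry\n' > "$sample"

first_char=$(cut -c1 "$sample")      # first character of every line
first_three=$(cut -c1-3 "$sample")   # characters 1 through 3 of every line
first_byte=$(cut -b1 "$sample")      # same as -c1 here, because the text is plain ASCII

printf '%s\n' "$first_char"    # a, b, c (one per line)
printf '%s\n' "$first_three"   # app, ban, che (one per line)

rm -f "$sample"
```

Note that cut works on every line of the input, so the result has one output line per input line.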

$ cut -d: -f6 /etc/passwd

In this command, "-d" sets the delimiter, here a colon (:), and "-f" selects the field to print. It prints the 6th colon-separated field of every line of /etc/passwd, which is the user's home directory.

You can also give a range such as "-d: -f6-8", which prints the 6th to 8th fields with the delimiter kept between them. Since /etc/passwd has only seven fields per line, that range prints fields 6 and 7.
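To see the delimiter and field options without depending on the contents of your own /etc/passwd, here is a small sketch that uses a made-up line in the same seven-field format. The user "tom" and all values in the line are assumptions for illustration.

```shell
# Hypothetical line in the /etc/passwd format (seven colon-separated fields).
line='tom:x:1001:1001:Tom Smith:/home/tom:/bin/bash'

home_dir=$(printf '%s\n' "$line" | cut -d: -f6)          # field 6 only
home_and_shell=$(printf '%s\n' "$line" | cut -d: -f6-7)  # fields 6 to 7, delimiter kept

echo "$home_dir"         # /home/tom
echo "$home_and_shell"   # /home/tom:/bin/bash
```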

2 ) awk command :

-> awk is a very powerful command for data extraction. Most of the time it is used to extract fields from a file or from the output of another command.

$ awk '{print $1}' file

It prints the 1st field of every line in the file. By default, fields are separated by spaces or tabs. $1 is the 1st field, $2 is the 2nd field, and so on, while $0 is the whole line.

$ awk '{print $NF}' file

Here $NF is the last field of each line, because NF holds the number of fields in the current line.
You can also pipe the output of another command into awk, like this:

$ ls -l | awk '{print $1,$3}'

awk first reads the output of the "ls -l" command and then prints "$1" and "$3", which are the 1st and 3rd columns of that output (the permissions and the owner).
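Because real "ls -l" output differs from machine to machine, the sketch below pipes two made-up lines in the same shape into awk. The file names, owners, and dates are assumptions for illustration only.

```shell
# Hypothetical text shaped like two lines of `ls -l` output.
listing='-rw-r--r-- 1 tom staff 120 Jan 10 09:00 notes.txt
drwxr-xr-x 2 amit staff 4096 Jan 11 10:30 projects'

perms_and_owner=$(printf '%s\n' "$listing" | awk '{print $1, $3}')  # fields 1 and 3
last_field=$(printf '%s\n' "$listing" | awk '{print $NF}')          # last field of each line
field_count=$(printf '%s\n' "$listing" | awk '{print NF}')          # NF without $ is the field count

printf '%s\n' "$perms_and_owner"   # -rw-r--r-- tom / drwxr-xr-x amit
printf '%s\n' "$last_field"        # notes.txt / projects
printf '%s\n' "$field_count"       # 9 / 9
```

The comma in "print $1, $3" makes awk put a space between the two fields in the output.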

$ echo "Hello Tom" | awk '{$2="Developers";print $0}'

This command first echoes "Hello Tom". Assigning a value to "$2" replaces the second field, so "Tom" becomes "Developers", and "$0" then prints the whole rebuilt line, which is "Hello Developers".
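One detail worth knowing when you replace a field: awk rebuilds the line using its output field separator (OFS), which is a space by default. The sketch below shows this with a colon-separated sample string, which is an assumption made up for the example.

```shell
# The example from above: replace the second field and print the whole line.
replaced=$(echo "Hello Tom" | awk '{$2="Developers"; print $0}')

# With -F: the input is split on colons, but the rebuilt line uses OFS (a space by default).
rebuilt_default=$(echo "a:b:c" | awk -F: '{$2="X"; print $0}')

# Setting OFS to a colon keeps the original separator in the output.
rebuilt_colon=$(echo "a:b:c" | awk -F: -v OFS=: '{$2="X"; print $0}')

echo "$replaced"          # Hello Developers
echo "$rebuilt_default"   # a X c
echo "$rebuilt_colon"     # a:X:c
```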

3 ) 'grep' or 'egrep' command :

grep: Stands for "global regular expression print". It reads the file line by line and prints every line that matches the specified pattern.

$ grep "keyword" file

In place of "keyword", you can write any word or pattern that you want to find in the file.

In the grep command you can add -c, -i, -v, or -n before the keyword.

  1. "-c" : It counts the lines that match the keyword and prints that number instead of the lines. A line that contains the keyword twice is still counted once.

  2. "-i" : It ignores case while matching. Suppose your file has the same word in two forms ("File", "file"): with "-i" both lines are printed, but without "-i" only the lines that match the exact case you typed are printed.

  3. "-v" : It inverts the match and displays all the lines that do not match the given keyword.

  4. "-n" : It displays the matched lines prefixed with their line numbers.
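The four options can be compared side by side on a small file. This is a minimal sketch, assuming a hypothetical four-line sample; note that the last line contains the keyword twice but is counted once by "-c".

```shell
# Hypothetical sample data written to a temporary file.
sample=$(mktemp)
printf 'File one\nfile two\nother line\nfile file three\n' > "$sample"

count=$(grep -c "file" "$sample")         # number of matching lines
ignore_case=$(grep -i "file" "$sample")   # matches "File" as well as "file"
inverted=$(grep -v "file" "$sample")      # lines that do NOT contain "file"
numbered=$(grep -n "file" "$sample")      # matching lines with line numbers

echo "$count"                    # 2
printf '%s\n' "$ignore_case"     # File one / file two / file file three
printf '%s\n' "$inverted"        # File one / other line
printf '%s\n' "$numbered"        # 2:file two / 4:file file three

rm -f "$sample"
```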

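$ egrep -i "keyword1|keyword2" file

In this command, the "|" inside the quotes is not a shell pipe but a regular-expression "or", so you can search for many keywords at once. Do not put spaces around the "|", because they would become part of the keywords. With "-i", it prints every line that contains any of the keywords, regardless of case. On current systems "grep -E" is the preferred spelling of egrep.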
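Searching for several keywords at once can be sketched as follows, using "grep -E", which behaves like egrep. The log-style sample lines are assumptions made up for the example, and the second command shows an equivalent form that passes each keyword with its own "-e" option.

```shell
# Hypothetical sample data written to a temporary file.
sample=$(mktemp)
printf 'Error: disk full\nwarning: low memory\ninfo: all good\n' > "$sample"

# Extended regular expression: "error" OR "warning", ignoring case.
matches=$(grep -E -i "error|warning" "$sample")

# Equivalent search that gives each keyword with a separate -e option.
matches_e=$(grep -i -e "error" -e "warning" "$sample")

printf '%s\n' "$matches"   # Error: disk full / warning: low memory

rm -f "$sample"
```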

4 ) sort/uniq command :

-> The sort command arranges the lines of a file or of piped input in alphabetical order.

$ sort file
$ sort -r file

In the first command, it sorts all the lines of the file in alphabetical order.

In the second command, "-r" sorts the lines in reverse alphabetical order.
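Both orders can be checked on a three-line sample. This is a minimal sketch with made-up data; setting LC_ALL=C is an assumption added here so that the ordering does not depend on your locale settings.

```shell
# Hypothetical sample data written to a temporary file.
sample=$(mktemp)
printf 'pear\napple\nmango\n' > "$sample"

ascending=$(LC_ALL=C sort "$sample")       # alphabetical order
descending=$(LC_ALL=C sort -r "$sample")   # reverse alphabetical order

printf '%s\n' "$ascending"    # apple / mango / pear
printf '%s\n' "$descending"   # pear / mango / apple

rm -f "$sample"
```

Note that sort prints the sorted lines and leaves the file itself unchanged.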

-> The uniq command removes duplicate lines, but it only compares adjacent lines. On unsorted input it misses duplicates that are not next to each other, so you normally pipe the output of sort into it.

$ sort file | uniq 
$ sort file | uniq -d
$ sort file | uniq -c

In the first command, it sorts the file and then prints each distinct line only once. The file itself is not changed.

In the second command, it again sorts the file, but "-d" prints only the lines that are repeated, once each.

In the third command, it sorts the file and "-c" prefixes each distinct line with the number of times it occurs.
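The sketch below runs the three pipelines on a made-up list of colors and also shows what happens when uniq is used without sort. The sample data, the LC_ALL=C setting (for a predictable order), and the final awk step (which only removes the padding that "uniq -c" puts before each count) are assumptions added for illustration.

```shell
# Hypothetical sample data: no two equal lines are adjacent.
sample=$(mktemp)
printf 'red\nblue\nred\ngreen\nblue\nred\n' > "$sample"

unsorted=$(uniq "$sample")                      # nothing removed: duplicates are not adjacent
deduped=$(LC_ALL=C sort "$sample" | uniq)       # each distinct line once
repeated=$(LC_ALL=C sort "$sample" | uniq -d)   # only lines that occur more than once
counted=$(LC_ALL=C sort "$sample" | uniq -c | awk '{print $1, $2}')  # count, then line

printf '%s\n' "$deduped"    # blue / green / red
printf '%s\n' "$repeated"   # blue / red
printf '%s\n' "$counted"    # 2 blue / 1 green / 3 red

rm -f "$sample"
```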

5 ) wc command :

-> This command prints the newline count, word count, and byte count for each file you give it.

$ wc file
$ wc -l file
$ wc -c file

In the first command, "wc file" prints one line with three columns followed by the file name, i.e. line count, word count, and byte count. (x,y,z)

x = Line count

y = word count

z = byte count

In the second command, "-l" prints only the number of lines in the file.

In the third command, "-c" prints only the number of bytes in the file. To count words instead, use "-w".
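The counts can be verified on a two-line sample. This is a minimal sketch with made-up data; reading the file through "<" makes wc print only the number without the file name, and the "tr -d" step is an assumption added to strip the padding that some wc versions put before the number.

```shell
# Hypothetical sample data: 2 lines, 5 words, 24 bytes (newlines included).
sample=$(mktemp)
printf 'one two\nthree four five\n' > "$sample"

lines=$(wc -l < "$sample" | tr -d ' ')   # newline count
words=$(wc -w < "$sample" | tr -d ' ')   # word count
bytes=$(wc -c < "$sample" | tr -d ' ')   # byte count

echo "$lines $words $bytes"   # 2 5 24

rm -f "$sample"
```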

That's all !!

I hope all you readers learned something new and found this article useful.

THANK YOU !!

You can follow me on Twitter: Amit Maurya

And be sure to share this with other Linux folks, as it will surely help them in learning Linux.