Debug School

rakesh kumar

How to retrieve data by manipulating it with awk commands

1. What is the awk command in Linux and why do we use it?
2. What are the variables in awk?
3. Explain the following terms:

  • Awk options
  • Awk preprocessing and postprocessing
  • Built-in variables
  • User-defined variables
  • Structured commands
  • Loops
  • Formatted printing
  • Built-in functions
  • String functions
  • User-defined functions

4. How to print every line of data from the specified file.
5. How to print the Name and Salary fields from the specified file (two methods).
6. How to print every line of data from the specified file.
7. What are the built-in variables in awk?
8. How to print all the lines along with the line number.
9. How to print lines 3 to 6.
10. How to print the first item along with the row number (NR), separated with " - ", from each line in geeksforgeeks.txt.
11. How to return the second column/item from geeksforgeeks.txt.
12. How to print any non-empty line, if present.
13. How to count the lines in a file.
14. How to find the length of the longest line present in the file.
15. How to print lines with more than 10 characters.
16. How to find/check for any string in any specific column.
17. How to print the squares of the numbers from 1 to n, say 6.
18. How to separate the output of a given file with a '-'.
19. How to calculate the sum of a particular column.
20. How to print lines with more than 10 characters.

What is the awk command in Linux and why do we use it
Awk is a scripting language used for manipulating data and generating reports. It requires no compiling and lets the user work with variables, numeric functions, string functions, and logical operators.

Awk lets a programmer write small but effective programs as statements that define the text patterns to search for in each line of a document and the action to take when a match is found. Awk is mostly used for pattern scanning and processing. It searches one or more files for lines that match the specified patterns and then performs the associated actions.

1. AWK Operations: 
(a) Scans a file line by line 
(b) Splits each input line into fields 
(c) Compares input line/fields to pattern 
(d) Performs action(s) on matched lines 

2. Useful For: 
(a) Transform data files 
(b) Produce formatted reports 

3. Programming Constructs: 
(a) Format output lines 
(b) Arithmetic and string operations 
(c) Conditionals and loops 
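
The operations listed above can be seen together in one short run. This is only a minimal sketch: the sample records and the 40000 threshold are made up for illustration and are not part of the original article.

```shell
# Hypothetical sample data: name, role, salary.
data='ajay manager 45000
sunil clerk 25000
varun manager 50000'

# awk scans the input line by line, splits each line into fields,
# compares field 3 against the pattern ($3 > 40000), and runs the
# action (print two fields) only on the lines that match.
high_paid=$(printf '%s\n' "$data" | awk '$3 > 40000 {print $1, $3}')
printf '%s\n' "$high_paid"
```

Only the two records whose third field exceeds the threshold are printed; the clerk line is scanned and split like the others but fails the comparison, so its action never runs.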

Syntax:

awk options 'selection_criteria {action}' input-file > output-file

Options:

-f program-file : Reads the AWK program source from the file 
                  program-file, instead of from the 
                  first command line argument.
-F fs            : Use fs for the input field separator

What are the Variables In Awk

With awk, you can process text files. Awk assigns some variables for each data field found:

  • $0 for the whole line.
  • $1 for the first field.
  • $2 for the second field.
  • $n for the nth field.

Whitespace (spaces or tabs) is the default separator between fields in awk.

$ awk '{print $1}' myfile

Sometimes the field separator in a file is neither a space nor a tab. You can specify it with the -F option:

$ awk -F: '{print $1}' /etc/passwd

Using Multiple Commands
To run multiple commands, separate them with a semicolon like this:

$ echo "Hello Tom" | awk '{$2="Adam"; print $0}' 

Reading The Script From a File

You can type your awk script in a file and specify that file using the -f option.

Our file, testfile, contains this script:

{print $1 " home at " $6}

We run it like this:

$ awk -F: -f testfile /etc/passwd

Here we print each username and its home path from /etc/passwd. The field separator, a colon, is specified with the capital -F option.

You can also spread your awk script file over several lines, like this:

{
    text = $1 " home at " $6
    print text
}

Run it the same way:

$ awk -F: -f testfile /etc/passwd
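
To try the script-file approach without depending on the contents of a real /etc/passwd, the following sketch writes both the script and a small stand-in data file to a temporary directory. The two account lines and the temporary paths are assumptions made for the example.

```shell
tmpdir=$(mktemp -d)

# Hypothetical stand-in for /etc/passwd so the result is predictable.
printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'alice:x:1000:1000:Alice:/home/alice:/bin/sh' > "$tmpdir/passwd"

# The awk program lives in its own file (named testfile, as in the text).
cat > "$tmpdir/testfile" <<'EOF'
{
    text = $1 " home at " $6
    print text
}
EOF

# -F: sets the field separator, -f names the program file.
script_out=$(awk -F: -f "$tmpdir/testfile" "$tmpdir/passwd")
printf '%s\n' "$script_out"

rm -r "$tmpdir"
```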

Explain the following terms

  • Awk option
  • Awk Preprocessing and Postprocessing
  • Built-in Variables
  • User Defined Variables
  • Structured Commands
  • loop
  • Formatted Printing
  • Built-In Functions
  • String Functions
  • User Defined Functions

Awk option

$ awk options program file

Awk can take the following options:

-F fs To specify a field separator.

-f file To specify a file that contains awk script.

-v var=value To declare a variable.

We will see how to process files and print results using awk.
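
The -v option has no example elsewhere in this post, so here is a minimal sketch of it. The variable names greeting and min and the sample input are hypothetical.

```shell
# -v assigns an awk variable before the program starts, so it is
# already visible inside BEGIN.
v_out=$(awk -v greeting="Hello" 'BEGIN {print greeting ", awk"}')
printf '%s\n' "$v_out"

# -v combined with -F: pass a threshold in from the shell and keep
# only the records whose second field reaches it.
f_out=$(printf 'a:5\nb:15\n' | awk -F: -v min=10 '$2 >= min {print $1}')
printf '%s\n' "$f_out"
```

Passing values with -v avoids splicing shell variables into the quoted awk program text.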

Awk Preprocessing and Postprocessing

If you need a title or header for your output, use the BEGIN keyword. Its block runs before any data is processed:

$ awk 'BEGIN {print "Report Title"}'

Let's apply it to a file so we can see the result:

$ awk 'BEGIN {print "The File Contents:"}
{print $0}' myfile

Awk Postprocessing

To run commands after all the data has been processed, use the END keyword:

$ awk 'BEGIN {print "The File Contents:"}
{print $0}
END {print "File footer"}' myfile

This is useful for adding a footer, for example.
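
The order in which the three blocks run can be checked with a self-contained sketch. The three input lines are invented, and the footer here prints NR (the number of records read) instead of fixed text, which is one common use of END.

```shell
summary=$(printf 'alpha\nbeta\ngamma\n' | awk '
BEGIN {print "The File Contents:"}   # runs once, before any input is read
{print $0}                           # runs once for each input line
END {print "Lines read: " NR}        # runs once, after the last line
')
printf '%s\n' "$summary"
```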

Let's combine them in a script file:

BEGIN {
    print "Users and their corresponding home"
    print " UserName \t HomePath"
    print "___________ \t __________"
    FS=":"
}

{
    print $1 "  \t  " $6
}

END {
    print "The end"
}

$ awk -f myscript /etc/passwd


Check this example and see how awk processes it:

How to print every line of data from the specified file

$ awk '{print}' employee.txt

How to print the Name and Salary fields from the specified file (two methods)
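
One plausible pair of methods is sketched below. It assumes a hypothetical employee.txt layout in which the name is the first field and the salary is the fourth and last field; adjust the field numbers if your file differs.

```shell
tmpdir=$(mktemp -d)

# Hypothetical employee.txt: name, designation, department, salary.
cat > "$tmpdir/employee.txt" <<'EOF'
ajay manager account 45000
sunil clerk account 25000
varun manager sales 50000
EOF

# Method 1: address both fields by position.
by_position=$(awk '{print $1, $4}' "$tmpdir/employee.txt")

# Method 2: use $NF, the last field of the record, so the salary is
# still found if extra columns are added in the middle.
by_last_field=$(awk '{print $1, $NF}' "$tmpdir/employee.txt")

printf '%s\n' "$by_position"
rm -r "$tmpdir"
```

With this layout both commands print the same three name and salary pairs.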

How to print the lines that match a given pattern from the specified file

$ awk '/manager/ {print}' employee.txt

What are the built-in variables in awk

NR: Keeps a running count of the input records read so far. Records are usually lines, and awk runs the pattern/action statements once for each record in a file.
NF: Holds the number of fields in the current input record.
ARGC: The number of arguments provided on the command line.
ARGV: An array that stores the command-line arguments. Its valid indexes range from 0 to ARGC-1.
CONVFMT: The conversion format for numbers. Its default value is %.6g.
ENVIRON: An associative array of environment variables.
FILENAME: The name of the current input file.
FS: The field separator used to divide the input line into fields. The default is whitespace, meaning space and tab characters. FS can be reassigned to another character (typically in BEGIN) to change the field separator.
RS: The current record separator. Because an input line is the input record by default, the default record separator is a newline.
OFS: The output field separator, which separates the fields when awk prints them. The default is a single space. Whenever print is given several parameters separated by commas, it prints the value of OFS between them.
ORS: The output record separator, which separates the output lines when awk prints them. The default is a newline character. print automatically outputs the contents of ORS at the end of whatever it is given to print.
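
The four separator variables can be tried out with a short sketch. The input strings are invented, and each command changes only the variables named in its comment.

```shell
# FS and OFS: read colon-separated fields, write them joined by " - ".
ofs_out=$(printf 'root:x:0\nbin:x:1\n' |
  awk 'BEGIN {FS=":"; OFS=" - "} {print $1, $3}')
printf '%s\n' "$ofs_out"

# RS: treat ";" instead of a newline as the end of each input record.
rs_out=$(printf 'a;b;c' | awk 'BEGIN {RS=";"} {print NR ": " $0}')
printf '%s\n' "$rs_out"

# ORS: end every printed record with "," instead of a newline.
ors_out=$(printf 'a\nb\nc\n' | awk 'BEGIN {ORS=","} {print $0}')
printf '%s\n' "$ors_out"
```

Note that OFS is only inserted where print receives comma-separated arguments; writing print $1 $3 would concatenate the fields with nothing between them.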

NR:
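
A minimal sketch of NR in use, with made-up input lines. The first command numbers every line; the second uses NR in the pattern to select a range of lines.

```shell
# Prefix each line with its record number.
nr_out=$(printf 'apple\nbanana\ncherry\n' | awk '{print NR, $0}')
printf '%s\n' "$nr_out"

# Print only records 2 through 3.
range_out=$(printf 'apple\nbanana\ncherry\n' |
  awk 'NR >= 2 && NR <= 3 {print NR, $0}')
printf '%s\n' "$range_out"
```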
NF:
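
A minimal sketch of NF in use, with made-up input. NF is the field count of the current record, so $NF refers to the last field on that line.

```shell
# Print the number of fields on each line, then the last field.
nf_out=$(printf 'one two three\nfour five\n' | awk '{print NF, $NF}')
printf '%s\n' "$nf_out"
```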
ARGV:
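
A minimal sketch of ARGC and ARGV, using the placeholder arguments first and second. Because the program has only a BEGIN block, awk never tries to open them as files. The loop starts at index 1 because ARGV[0] holds the name of the awk program itself, which varies between implementations.

```shell
argv_out=$(awk 'BEGIN {
    print "ARGC =", ARGC
    for (i = 1; i < ARGC; i++)
        print i, ARGV[i]
}' first second)
printf '%s\n' "$argv_out"
```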