[awk] Remove duplicated lines and compute a sum

Objective
  1. Remove the duplicated lines.
  2. Compute a sum for each person.

Input Data (delimiter: tab)
name    job     subject records
min     student math    100
min     student english 100
min     worker  math    100
min     worker  english 100
yong    student math    50
yong    student english 70
yong    worker  math    20
yong    worker  english 30

Suppose we run into the problem above.
To compute a sum for each person, we can use the awk command.

Expected Result
yong    170
min     400

Create an awk file (makeAsum.awk).
BEGIN {
        FS = "\t";  # input field delimiter
        OFS = "\t"; # output field delimiter
        OUTPUT_FILE = "result.txt";
}
# Skip the header row (its 4th field is not a number).
$4 !~ /^[0-9]+$/ { next }
{
        # x = number of lines seen so far for each name.
        x[$1]++;
        if( x[$1] == 1 ){
                # First line for this name: remember the name once.
                name[$1] = $1;
        }
        # Add every line's record to the person's total, including the first.
        record[$1] += $4;
}
END {
        # Scan all elements in the array.
        # n is an index of x, such as min or yong; the iteration order is unspecified.
        for( n in x ){
                print name[n], record[n] > OUTPUT_FILE;
        }
}

I use this approach instead of the 'sort' and 'uniq' commands.
I wonder how to get the same result using those two commands.
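
One possible answer, as a rough sketch rather than a definitive solution: 'sort -u' (which behaves like 'sort | uniq') can remove lines that are exact duplicates, but sort and uniq cannot add numbers by themselves, so awk still does the summing. The array name sum is chosen for illustration, and the block recreates the sample input as test.txt so that it runs on its own.

```shell
# Recreate the sample input (tab-delimited, including the header row).
printf '%s\t%s\t%s\t%s\n' \
    name job     subject records \
    min  student math    100 \
    min  student english 100 \
    min  worker  math    100 \
    min  worker  english 100 \
    yong student math    50 \
    yong student english 70 \
    yong worker  math    20 \
    yong worker  english 30 > test.txt

# tail -n +2 : drop the header row
# sort -u    : remove lines that are exact duplicates
# awk        : add field 4 to a per-name total
# sort       : give the output a stable order
result=$(tail -n +2 test.txt | sort -u |
    awk -F '\t' -v OFS='\t' '
        { sum[$1] += $4 }
        END { for (n in sum) print n, sum[n] }
    ' | sort)

printf '%s\n' "$result"
```

In the sample data every row is already unique, so 'sort -u' drops nothing and each total covers all four rows of that person. The final sort is there only because awk does not guarantee the order of 'for (n in sum)'.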

Tip: how to pass a variable on the command line.
A variable set with -v can be used inside the awk file.
$> awk -v var=something -f makeAsum.awk test.txt
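
As a minimal sketch of how such a variable might be used inside the program: the variable name who, the file name sample.txt, and the inline program are assumptions made for illustration. The same -v option works with -f and a script file.

```shell
# A few tab-delimited rows from the sample data (no header row).
printf '%s\t%s\t%s\t%s\n' \
    min  student math    100 \
    yong student math    50 \
    yong student english 70 \
    yong worker  math    20 \
    yong worker  english 30 > sample.txt

# -v who=yong sets the awk variable who before the program starts.
# Inside the program it is written as who, without a $ sign.
total=$(awk -F '\t' -v OFS='\t' -v who=yong '
    $1 == who { sum += $4 }      # only rows for the requested person
    END       { print who, sum }
' sample.txt)

printf '%s\n' "$total"
```

Changing the value after -v, for example who=min, selects another person without editing the program.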

