[awk] Remove the duplicated lines and make a sum
Objective
- Remove the duplicated lines (keep one line per name).
- Make a sum of the records for each person.
Input Data (delimiter: tab)
name job subject records
min student math 100
min student english 100
min worker math 100
min worker english 100
yong student math 50
yong student english 70
yong worker math 20
yong worker english 30
We may face a problem like the one above.
When I have to make a sum for each person, I can use the awk command.
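Expected Result
yong 170
min 400
Make an awk file (makeAsum.awk).
BEGIN {
    FS = "\t";   # input data delimiter
    OFS = "\t";  # output data delimiter
    OUTPUT_FILE = "result.txt";
}
# Skip the header line, if the file has one.
NR == 1 && $1 == "name" { next }
{
    # x = number of lines seen so far for each name.
    x[$1]++;
    if (x[$1] == 1) {
        # First line for this name: keep the name only once.
        name[$1] = $1;
    }
    # Make the sum from every line, including the first one.
    record[$1] += $4;
}
END {
    # Scan all elements in the array.
    # n is an index of x, such as min or yong.
    # The order of a for-in loop is not specified by awk.
    for (n in x) {
        print name[n], record[n] > OUTPUT_FILE;
    }
}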
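Because a for-in loop visits array indices in an unspecified order, the line order of the output can differ between awk implementations. The following is a minimal sketch, assuming a POSIX shell and a tab-delimited test.txt without a header line (rebuilt here from the sample data). It sums column 4 of every line per name, prints to standard output, and pipes the lines through sort to get a stable order. The file name result_sorted.txt is only an illustration.

```shell
# Rebuild the sample input: tab-delimited, no header line (assumption).
printf '%s\t%s\t%s\t%s\n' \
    min student math 100 \
    min student english 100 \
    min worker math 100 \
    min worker english 100 \
    yong student math 50 \
    yong student english 70 \
    yong worker math 20 \
    yong worker english 30 > test.txt

# Sum column 4 per name, then sort so the line order is stable.
awk 'BEGIN { FS = OFS = "\t" }
     { total[$1] += $4 }
     END { for (n in total) print n, total[n] }' test.txt | sort > result_sorted.txt

cat result_sorted.txt
```

Sorting outside awk keeps the program portable; inside awk there is no standard way to control the for-in order.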
Actually, I use this approach instead of the 'sort' and 'uniq' commands.
I wonder how to get the same result using both of them.
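One possible answer, as a sketch rather than a definitive solution: sort and uniq can reduce the first column to one line per name, but neither command can add numbers, so something else has to do the addition. The example below assumes a POSIX shell and a tab-delimited test.txt without a header line (rebuilt here from the sample data), and it uses shell arithmetic for the sum. The file name result_sort_uniq.txt and the variable names are hypothetical.

```shell
# Rebuild the sample input: tab-delimited, no header line (assumption).
printf '%s\t%s\t%s\t%s\n' \
    min student math 100 \
    min student english 100 \
    min worker math 100 \
    min worker english 100 \
    yong student math 50 \
    yong student english 70 \
    yong worker math 20 \
    yong worker english 30 > test.txt

TAB="$(printf '\t')"

# cut + sort + uniq reduce column 1 to one line per name.
cut -f1 test.txt | sort | uniq | while read -r person; do
    total=0
    # Re-read the file and add column 4 of every line that belongs to this person.
    while IFS="$TAB" read -r name job subject score; do
        if [ "$name" = "$person" ]; then
            total=$((total + score))
        fi
    done < test.txt
    printf '%s\t%s\n' "$person" "$total"
done > result_sort_uniq.txt

cat result_sort_uniq.txt
```

This reads the file once per name, so on large inputs it is slower than a single awk pass.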
Tip: how to pass a variable when you run the command.
You can set a variable with the -v option and use it inside the awk file.
$>awk -v var=something -f makeAsum.awk test.txt
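As a minimal sketch of this tip, the command below passes a name with -v and reads it inside the program as an ordinary awk variable. It assumes a POSIX shell and a tab-delimited test.txt without a header line (rebuilt here from the sample data). The variable name person and the file name result_person.txt are hypothetical and are not part of makeAsum.awk.

```shell
# Rebuild the sample input: tab-delimited, no header line (assumption).
printf '%s\t%s\t%s\t%s\n' \
    min student math 100 \
    min student english 100 \
    min worker math 100 \
    min worker english 100 \
    yong student math 50 \
    yong student english 70 \
    yong worker math 20 \
    yong worker english 30 > test.txt

# -v person=yong sets the awk variable before the program starts,
# so it is already available in every pattern and action.
awk -F'\t' -v person=yong '
    $1 == person { sum += $4 }      # only lines for that person
    END { print person "\t" sum }
' test.txt > result_person.txt

cat result_person.txt
```

A variable set with -v is also visible in the BEGIN block, which is not the case for assignments placed after the program text on the command line.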