Sunday, May 26, 2013

What is the coolest data structure?

I'll go with associative arrays.  Especially as implemented within awk.  

Although associative arrays are nowhere near as intricate or graphically stunning as some other data models, they're over-the-top-cool, because of how immensely useful they are for basic text transformation.

You can code just about any transformation you want on the stdout of any Unix/Linux command using awk's associative arrays.
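Here's a minimal sketch of the idea, in case you haven't played with them before. The input format ("name,bytes" pairs) and the function name summarize are made up for illustration; the point is only that the array index can be any string, so keeping a running total per key is a one-liner.

```shell
# Hypothetical input: "name,bytes" pairs, one per line (not a real log format).
summarize() {
  awk -F, '
    { total[$1] += $2 }                         # index by field 1, keep a running sum
    END { for (u in total) print total[u], u }  # "for (u in ...)" order is unspecified
  ' | sort -n                                   # so sort numerically afterwards
}

printf 'alice,10\nbob,5\nalice,7\n' | summarize
```

Since awk doesn't promise any particular order when looping over an array, the trailing sort -n is what puts the biggest totals at the bottom.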

For example... here's a command that works with ALL of the maillog files, rotated or not, compressed or not, and tells you which users send/receive the largest volumes of email:

[code bash]
# zgrep reads plain and compressed logs alike; -h drops the filename prefix
zgrep -h "sent=" maillog* | \
sed -e 's/^.*user=//' -e 's/rcvd=//' -e 's/sent=//' | \
awk -F, '{ t[$1] += $5 + $6; r[$1] += $5; s[$1] += $6 }
         END { for (i in t) print t[i] " " s[i] " " r[i] " " i }' | \
sort -n
[/code]

Output format is:  

combined-total sent-total received-total email-address.  

Sample output:

11635906 11530222 105684 boss@somecompany.com
33077188 32995397 81791 biggerboss@somecompany.com
41524794 41225163 299631 ceo@somecompany.com
82771501 81433867 1337634 guywhodoesrealwork@somecompany.com

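You could have it give you the totals in K or M by simply appending /1024 or /1048576 to each value in the "print" statement (for example, t[i]/1024). Division binds tighter than string concatenation in awk, so no extra parentheses are needed.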
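A rough sketch of the K variant, run against a made-up, simplified input of "address,received-bytes,sent-bytes" (an assumption for illustration, not the real maillog layout; the function name totals_in_k is hypothetical too):

```shell
# Hypothetical simplified input: "address,received-bytes,sent-bytes".
totals_in_k() {
  awk -F, '
    { t[$1] += $2 + $3; r[$1] += $2; s[$1] += $3 }
    END {
      for (i in t)
        # Each value is divided first, then the pieces are concatenated.
        print t[i]/1024 " " s[i]/1024 " " r[i]/1024 " " i
    }
  ' | sort -n
}

printf 'a@example.com,1024,2048\nb@example.com,512,512\na@example.com,1024,0\n' | totals_in_k
```

Fractional results are printed with awk's default number format, so if you want a fixed number of decimals, printf with something like %.1f is probably the tidier choice.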

Tell me that isn't just the coolest data structure you've ever seen.  Dare ya. :-)
