I'll go with associative arrays. Especially as implemented within awk.
Although associative arrays are nowhere near as intricate or graphically stunning as some other data models, they're over-the-top-cool, because of how immensely useful they are for basic text transformation.
You can code whatever sort of transformation you want to do to the stdout of any Unix/Linux command using awk's associative arrays.
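A minimal sketch of the idiom, using made-up input rather than real log data: the array index is just a string taken from the line, so counting per key takes one statement. The function name `count_by_user` and the sample lines are hypothetical, for illustration only.

```shell
# Hypothetical helper: count how many input lines each first-field value appears on.
count_by_user() {
  # n[] is an associative array keyed by the first field of each line.
  awk '{ n[$1]++ } END { for (u in n) print n[u], u }' | sort -n
}

# Made-up input: one "user action" pair per line.
printf 'alice login\nbob login\nalice logout\n' | count_by_user
```

The `for (u in n)` loop visits keys in no particular order, which is why the result gets piped through `sort -n`.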
For example... here's a command that'll work with ALL of the maillog files - rotated or not, compressed or not, and tell you which users send/receive the largest volumes of email:
[code bash]
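# zgrep reads plain and compressed files alike; -h drops the filename prefix.
zgrep -h "sent=" maillog* |
  sed -e 's/^.*user=//' -e 's/rcvd=//' -e 's/sent=//' |
  awk -F, '
    # Keyed by user ($1): t = combined, r = received ($5), s = sent ($6).
    { t[$1] += $5 + $6; r[$1] += $5; s[$1] += $6 }
    END { for (i in t) print t[i] " " s[i] " " r[i] " " i }' |
  sort -n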
[/code]
Output format is:
combined-total sent-total received-total email-address.
Sample output:
11635906 11530222 105684 boss@somecompany.com
33077188 32995397 81791 biggerboss@somecompany.com
41524794 41225163 299631 ceo@somecompany.com
82771501 81433867 1337634 guywhodoesrealwork@somecompany.com
You could have it give you the totals in K or M by simply appending /1024 or /1048576 to each of the numeric arguments of the "print" statement (not to the address itself).
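Here's a rough sketch of the K variant. It assumes the input has already been through the zgrep/sed steps, so each line is comma-separated with the user in field 1, received bytes in field 5 and sent bytes in field 6. The function name `totals_in_k`, the addresses and the byte counts are all made up for illustration.

```shell
# Hypothetical helper: same per-user totals as above, reported in K.
totals_in_k() {
  awk -F, '
    { t[$1] += $5 + $6; r[$1] += $5; s[$1] += $6 }
    # Division binds tighter than string concatenation, so /1024 applies per value.
    END { for (i in t) print t[i]/1024 " " s[i]/1024 " " r[i]/1024 " " i }' |
  sort -n
}

# Made-up, already-cleaned lines: user,x,y,z,rcvd,sent
printf 'a@example.com,x,y,z,1024,2048\na@example.com,x,y,z,1024,0\nb@example.com,x,y,z,1024,1024\n' |
  totals_in_k
```

Swap 1024 for 1048576 to get M instead. Values that don't divide evenly come out as decimals, so wrap them in int() if you only want whole numbers.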
Tell me that isn't just the coolest data structure you've ever seen. Dare ya. :-)