Sunday, May 26, 2013

What's taking all the disk space?

Finding out what's taking all your drive space can be really frustrating if you go about it the wrong way, because the naive approach will grind your I/O subsystem almost to a halt in the process.

Instead... try this.

find / -xdev -type f -size +51200k -exec stat -c'%s|%n' {} \; | egrep -iv "/ignore/these/dirs|/var/lib/mysql|/usr/lib/locale|/var/log" | awk -F\| '{ print $1/1024/1024 "MB " ": " $2 }' | sort -nr

Admittedly, awk plays a minor part in this; "find" is doing most of the work.  But I wanted to share it anyway because of how it uses the OPTIONS to find to make the job easier on your filesystem and faster to run.

-xdev says "only scan the device that / is on".
-type f says "only look at files".
-size +51200k says "only consider files larger than 50MB".

Instead of asking find to run "ls" or something similar to find out how large each file actually is, we use "stat" and give it a custom output format with the "-c" option.  That output format comes in handy later on, when awk uses the vertical bar ("pipe" in unix-ese) as its field separator.
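
To make that concrete, here's roughly what each stage of the pipeline sees (the path and size below are made up purely for illustration).  stat -c'%s|%n' emits one "size-in-bytes|path" line per file:

104857600|/home/someuser/big-backup.tar

awk splits that on the "|", divides the byte count by 1024 twice, and prints:

100MB : /home/someuser/big-backup.tar

and sort -nr then puts the largest files at the top of the list.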

Large files aren't the only thing that sucks up drive space.  Directories full of a humongous number of small files are also a problem.

To find those, the following works pretty well.

find / -xdev -type d -size +4096c -exec du -s {} \; | sort -n

That says: search only on the device holding / for directories that have had to add extra directory blocks (i.e., have grown above 4096 bytes)... then ask for a "disk usage" report on each of those.

Some of the directories in the resulting list may be small in terms of how much drive space they use, making you think "What's up with that?".  The reason is that directories don't shrink.  If a directory ever had to grow to accommodate many files, it'll still be "larger than 4096 bytes" even after those files are later removed.
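
You can check a directory's own size (as distinct from the space used by the files inside it) with ls -ld, or with the same stat trick from earlier; the path here is just an example:

ls -ld /var/spool/mqueue
stat -c'%s %n' /var/spool/mqueue

On ext3/ext4 with the default 4K block size, a freshly created directory is 4096 bytes; anything bigger means it has, at some point, held enough entries to need extra blocks, even if it's nearly empty now.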

However, the directories toward the END of that list (sort -n puts the biggest last) may surprise you with how much total space they're taking up in lots of small files.
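
If one of those directories looks like a likely culprit, a quick way to count how many files are sitting directly inside it (again, the path is just an example):

find /var/spool/mqueue -maxdepth 1 -type f | wc -l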
