Count the number of files and folder’s depth with ‘find’


Sometimes you want statistics about the files in a folder. I needed them to figure out why some file synchronization services didn’t work well. Typical questions are:

  1. How many files are there in a folder?
  2. How many of those are regular files?
  3. How many of those are sub-folders?
  4. What’s the maximum number of nested folders?

Windows and Mac each ship a built-in file manager that answers some of these questions. File Explorer on Windows can answer #1, #2, and #3 through Properties. Finder on Mac can answer #1 through Get Info, but it has no way to answer #2 or #3. Neither file manager can answer #4.

find can. It is a UNIX command-line tool that searches for files in a directory hierarchy. find is native to Linux, and Mac ships its own BSD version of it. Windows has a command named FIND as well, but it searches for text inside files rather than for files in a directory tree. These differences among the versions will trip up users.

Demo folder for illustration

# Make a demo folder
$ mkdir demo-find
$ cd demo-find/

# Make 2 main folders, one with nested sub-folders inside
$ mkdir -p d1/d11/d111 d1/d12 d2

# Create files
$ touch f1.txt f2.txt d1/f3.txt d1/d11/f4.txt d1/d11/d111/f5.txt d1/d11/d111/f6.txt d1/d12/f7.txt d2/f8.txt

# We'll then have this
demo-find/
├── d1
│   ├── d11
│   │   ├── d111
│   │   │   ├── f5.txt
│   │   │   └── f6.txt
│   │   └── f4.txt
│   ├── d12
│   │   └── f7.txt
│   └── f3.txt
├── d2
│   └── f8.txt
├── f1.txt
└── f2.txt
5 directories, 8 files

FIND FILES

find takes the form of

find [PATH] [OPTION]

Find all files

# This lists every file and folder under the given folder, one per line
# This is not yet what we want
# The trailing slash / is optional for a regular folder, but needed if the path is a symlink to a folder
$ find demo-find/

demo-find/
demo-find/d1
demo-find/d1/d11
demo-find/d1/d11/d111
demo-find/d1/d11/d111/f5.txt
demo-find/d1/d11/d111/f6.txt
demo-find/d1/d11/f4.txt
demo-find/d1/d12
demo-find/d1/d12/f7.txt
demo-find/d1/f3.txt
demo-find/d2
demo-find/d2/f8.txt
demo-find/f1.txt
demo-find/f2.txt

Count the number of all files: pipe find to wc -l

# This will also count the root 'demo-find' folder as 1
$ find demo-find/ | wc -l
    14

Count the number of regular files

$ find demo-find/ -type f | wc -l
    8
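
One caveat: wc -l counts lines, not files, so a file whose name contains a newline is counted twice. Here is a minimal sketch of a workaround, assuming your find supports -print0 (both GNU find and BSD find do). The temporary folder and the odd filename are made up for illustration.

```shell
# Build a throwaway tree with one awkward filename (hypothetical example data)
tmp=$(mktemp -d)
mkdir -p "$tmp/demo/d1"
touch "$tmp/demo/f1.txt" "$tmp/demo/d1/f2.txt"
# A single file whose name contains a newline character
touch "$tmp/demo/$(printf 'odd\nname.txt')"

# Line-based count: the odd file contributes two lines
naive=$(find "$tmp/demo" -type f | wc -l | tr -d ' ')

# NUL-based count: end each path with a NUL byte, keep only the NULs, count them
safe=$(find "$tmp/demo" -type f -print0 | tr -dc '\0' | wc -c | tr -d ' ')

echo "naive=$naive safe=$safe"
rm -rf "$tmp"
```

With three files on disk, this should print naive=4 safe=3. For everyday folders the simpler wc -l version is usually fine.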

Count the number of directories

# This will also count the root 'demo-find' folder as 1
$ find demo-find/ -type d | wc -l
    6
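
If you'd rather not count the root folder itself, -mindepth 1 tells find to skip the starting point. A small sketch, which rebuilds the demo folder in a temporary location (the tmp and root variables are just for this example):

```shell
# Recreate the demo tree from this post in a temporary location
tmp=$(mktemp -d)
root="$tmp/demo-find"
mkdir -p "$root/d1/d11/d111" "$root/d1/d12" "$root/d2"
touch "$root/f1.txt" "$root/f2.txt" "$root/d1/f3.txt" \
      "$root/d1/d11/f4.txt" "$root/d1/d11/d111/f5.txt" \
      "$root/d1/d11/d111/f6.txt" "$root/d1/d12/f7.txt" "$root/d2/f8.txt"

# -mindepth 1 skips the starting point (depth 0), so the root is not counted
entries=$(find "$root" -mindepth 1 | wc -l | tr -d ' ')
dirs=$(find "$root" -mindepth 1 -type d | wc -l | tr -d ' ')
# The root is a folder, not a regular file, so -type f needs no adjustment
files=$(find "$root" -type f | wc -l | tr -d ' ')

echo "entries=$entries dirs=$dirs files=$files"
rm -rf "$tmp"
```

This should print entries=13 dirs=5 files=8, which matches the "5 directories, 8 files" summary under the tree above.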

FIND THE MAXIMUM NUMBER OF NESTED FOLDERS

According to find’s manpage, we can use the -printf option with the %d format to print how deep each path sits below the starting folder.

-printf format

%d  File's depth in the directory tree; 0 means the file is a starting-point.
%p  File's name.

Let’s try it on our Mac:

$ find demo-find/ -type d -printf '%d:%p\n'
find: -printf: unknown primary or operator

Great, we’ve got an error -printf: unknown primary or operator. How come?

It turns out that macOS is based on BSD UNIX, and the BSD implementation of find doesn’t have the -printf option. GNU find, the version found on Linux nowadays, has many added features, -printf among them. Windows is a different story again: its FIND command searches for text inside files, so it doesn’t support -printf either.
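
If installing another find is not an option, the depth can also be approximated without -printf by counting the / separators in each path. This is only a sketch, not what the manpage suggests: it assumes no folder name contains a newline, and the tmp, root, and max_depth names are made up for the example.

```shell
# Recreate the folders of the demo tree in a temporary location
tmp=$(mktemp -d)
root="$tmp/demo-find"
mkdir -p "$root/d1/d11/d111" "$root/d1/d12" "$root/d2"

# Run find from inside the folder so every path starts with "./".
# The depth is then the number of "/" separators: "." is 0, "./d1" is 1, and so on.
max_depth=$(
  cd "$root" &&
  find . -type d |
  awk -F/ '{ d = NF - 1; if (d > max) max = d } END { print max + 0 }'
)

echo "$max_depth"
rm -rf "$tmp"
```

For the demo folder this should print 3. It only uses -type d and awk, so it should behave the same with BSD find and GNU find.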

Get GNU find

Depending on your OS, there are different ways to get GNU find. It’s part of the findutils package, so we need to install findutils to get find. On Mac, use Homebrew or MacPorts. On Windows, try Cygwin.

We’ll use brew on Mac:

# Install 'findutils' to get 'find'
$ brew install findutils

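# Edit PATH to make sure our terminal uses GNU's find instead of Apple's find
# Add these lines to .bashrc or .bash_profile
# (/usr/local is Homebrew's prefix on Intel Macs; run 'brew --prefix' to check yours)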
export PATH="/usr/local/opt/findutils/libexec/gnubin:$PATH"
export MANPATH="/usr/local/opt/findutils/libexec/gnuman:$MANPATH"

# Restart the terminal or source the bash file
$ source ~/.bashrc

# Make sure 'find' is now the GNU version
# Make sure the path is NOT /usr/bin/find which is the default Apple BSD version
$ which find
/usr/local/opt/findutils/libexec/gnubin/find
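
If you'd rather leave PATH alone, Homebrew also installs GNU find under the name gfind. Below is a hedged sketch of a helper that picks whichever GNU find is available. The gnu_find function name is made up for this example, and checking for "GNU findutils" in the --version output is an assumption about how GNU find identifies itself.

```shell
# Print the name of a GNU find on this machine, or fail if none is found.
# 'gfind' is the name Homebrew's findutils package installs by default.
gnu_find() {
  if command -v gfind >/dev/null 2>&1; then
    echo gfind
  elif find --version 2>/dev/null | grep -q 'GNU findutils'; then
    # Plain 'find' is already the GNU version (typical on Linux)
    echo find
  else
    return 1
  fi
}

if FIND=$(gnu_find); then
  echo "Using GNU find: $FIND"
else
  echo "No GNU find found; -printf will not work" >&2
fi
```

After this, "$FIND" can stand in for find in the commands that need -printf.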

Count max folder’s depth

find with -printf should now work:

$ find demo-find/ -type d -printf '%d:%p\n'
0:demo-find/
1:demo-find/d1
2:demo-find/d1/d11
3:demo-find/d1/d11/d111
2:demo-find/d1/d12
1:demo-find/d2

We can see that the maximum folder depth is 3.

Count max depth: pipe it through sort and tail

$ find demo-find/ -type d -printf '%d:%p\n' | sort -n | tail -1
3:demo-find/d1/d11/d111

And there we have it: 3 is the max depth. The file path may be very long, so to print only the depth, drop %p from the format:

$ find demo-find/ -type d -printf '%d\n' | sort -n | tail -1
3
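
The same %d output can also show how the folders are spread across the levels, not just the deepest one. A small sketch, assuming find here is GNU find (on a Mac without the PATH change, substitute gfind). The tmp, root, and histogram names are only for this example.

```shell
# Recreate the folders of the demo tree in a temporary location
tmp=$(mktemp -d)
root="$tmp/demo-find"
mkdir -p "$root/d1/d11/d111" "$root/d1/d12" "$root/d2"

# Print one depth per folder, group equal depths with sort + uniq -c,
# then reorder the "count depth" columns into a readable line
histogram=$(find "$root" -type d -printf '%d\n' | sort -n | uniq -c |
  awk '{ printf "depth %s: %s\n", $2, $1 }')

echo "$histogram"
rm -rf "$tmp"
```

For the demo folder this should print one line per level: depth 0: 1, depth 1: 2, depth 2: 2, and depth 3: 1.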
